Abstract

This paper presents a new model of asymmetric bifurcating autoregressive process with
random coefficients. We couple this model with a Galton-Watson tree to take into account
possibly missing observations. We propose least-squares estimators for the various parameters
of the model and prove their consistency, with a convergence rate, and asymptotic normality.
We use both the bifurcating Markov chain and martingale approaches and derive new results
in both these frameworks.

1 Introduction

In the 80's, Cowan and Staudte [7] introduced Bifurcating Autoregressive processes (BAR) as a
parametric model to study cell lineage data. A quantitative characteristic of the cells (e.g. growth
rate, age at division) is recorded over several generations descended from an initial cell, keeping
track of the genealogy to study inherited effects. As a cell usually gives birth to two offspring
by division, such genealogies are naturally structured as binary trees. BAR processes are thus
a generalization of autoregressive processes (AR) to this binary tree structure, by modeling each
line of descent as a first order AR process, allowing the environmental effects on sister cells to
be correlated. Statistical inference for the parameters of BAR processes has been widely studied, 
either based on the observation of a single tree growing to infinity [TBI I2H1 HE] or on a large 
number of small independent trees [2TJ[T9]. See also |23[l22] for processes indexed by general trees. 

Various extensions of the original model have been proposed, e.g. non gaussian noise sequence 
[5J [21]) higher order AR [501 (23 or moving average AR [TO]. Since 2005, evidence of asymmetry 
in cell division has been established by biologists [2S] and an asymmetric BAR model has been 
introduced by Guyon [13] where the coefficients of the AR processes of sister cells are allowed to
be different. This model was further extended to higher order AR [3], to take missing data into 
account [HI [TU] and with parasite infection [JJ . 

To the best of our knowledge, only two papers [5] and [JJ deal with random coefficient BAR 
processes. In the former by Bui and Huggins it is explained that random coefficients BAR pro- 
cesses can account for observations that do not fit the usual BAR model. For instance, the extra 
randomness can model irregularities in nutrient concentrations in the media in which the cells are 
grown. Other evidence for the need of richer models can be found e.g. in |14| . In this paper, 
we propose a new model for random coefficient BAR processes (R-BAR). It is more general than
that of Bui and Huggins, as the random variables are not supposed to be Gaussian, they may
not have moments of all order and correlations between all the sources of randomness are allowed.
Moreover, we propose an asymmetric model in the continuation of [131 [31 [HI EH E| in the context 
of missing data. Indeed, experimental data are often incomplete and it is important to take this 
phenomenon into account for the inference. As in [111 ||| we model the structure of available data 
by a Galton-Watson tree, instead of a complete binary tree. Our model is close to that developed
in [3], but the assumptions on the noise process are different as we allow correlation between the
two sources of randomness but require higher moments because of the missing data and because 
we do not use a weighted estimator. The main difference is that the model in [3] is fully observed, 
whereas ours allows for missing observations. 

Our approach for the inference of our model is also different from OH]. As we cannot use 
maximum likelihood estimation, we propose modified least squares estimators as in |24| . In [BJ, 
inference is based on an asymptotically infinite number of small replicated trees. Here, as in [3], 
we consider one single tree growing to infinity but our least squares estimator is not weighted. 
The originality of our approach is that it combines the bifurcating Markov chain and martingale
approaches. Bifurcating Markov chains (BMC) were introduced in [13] on complete binary trees
and further developed in [11] in the context of missing data on Galton-Watson trees. BAR models
can be seen as a special case of BMC. This interpretation allows us to establish the convergence of
our estimators. A by-product of our procedure is a new general result for BMC on Galton-Watson
trees. Indeed, in [13, 11] the driving noise sequence is assumed to have moments of all order. Here,
we establish new laws of large numbers for polynomial functions of the BMC where the noise
sequence only has moments up to a given order. The strong law of large numbers [12] and the
central limit theorem [151 1161 112| for martingales have been previously used in the context of BAR 
processes [571 HH] and adapted to special cases of martingales on binary trees [31 El EH E] ■ In this 
paper, we establish a general law of large numbers for square-integrable martingales on
Galton-Watson binary trees. This result is applied to our R-BAR model to obtain sharp convergence rates
and a quadratic strong law for our estimators. 

The paper is organized as follows. In Section 2 we give the precise definition of our R-BAR
model on a Galton-Watson tree and state our main assumptions. In Section 3 we give modified least
squares estimators and state the convergence results we obtained: consistency with convergence
rate and asymptotic normality. In Section 4 we recall the BMC framework, prove a new law
of large numbers under limited moment conditions and apply it to our R-BAR model to derive
the consistency of our estimators. In Section 5 we establish a new general law of large numbers
for square-integrable martingales on Galton-Watson trees and use it to derive convergence rates
and quadratic strong laws for our estimators. In Section 6 we establish the asymptotic normality
by using central limit theorems for martingales. Finally in Section 7 we apply our estimation
procedure to the E. coli data of [25].



2 Model 

In the sequel, all random variables are defined on the probability space $(\Omega, \mathcal{A}, \mathbb{P})$. As in
the previous literature, we use the index 1 for the original cell, and the two offspring of cell $k$
are labelled $2k$ and $2k+1$. Consider the first-order asymmetric random coefficients bifurcating
autoregressive process (R-BAR) given, for all $k \geq 1$, by

$$X_{2k} = (b_{2k} + \eta_{2k}) X_k + (a_{2k} + \varepsilon_{2k}), \qquad (2.1)$$

$$X_{2k+1} = (b_{2k+1} + \eta_{2k+1}) X_k + (a_{2k+1} + \varepsilon_{2k+1}),$$

with the following notations: for all $k \geq 1$,

$$a_{2k} = a, \quad b_{2k} = b$$

and

$$a_{2k+1} = c, \quad b_{2k+1} = d.$$

The initial state $X_1$ is the characteristic of the original ancestor while the sequence $(\varepsilon_{2k}, \eta_{2k}, \varepsilon_{2k+1}, \eta_{2k+1})_{k \geq 1}$
is the driving noise of the process, and the parameter $(a,b,c,d)$ belongs to $\mathbb{R}^4$. One can see this
R-BAR process as a random-coefficient first-order autoregressive process on a binary tree, where
each vertex represents an individual or cell, vertex 1 being the original ancestor. For all $n \geq 1$,
denote the $n$-th generation by

$$\mathbb{G}_n = \{2^n, 2^n + 1, \ldots, 2^{n+1} - 1\}.$$

In particular, $\mathbb{G}_0 = \{1\}$ is the initial generation and $\mathbb{G}_1 = \{2, 3\}$ is the first generation of offspring
from the original ancestor. Finally, denote by

$$\mathbb{T}_n = \bigcup_{\ell=0}^{n} \mathbb{G}_\ell$$

the sub-tree of all individuals from the original individual up to the $n$-th generation and $\mathbb{T}$ the
complete tree. Note that the cardinality $|\mathbb{G}_n|$ of $\mathbb{G}_n$ is $2^n$ while that of $\mathbb{T}_n$ is $|\mathbb{T}_n| = 2^{n+1} - 1$. In
all the sequel, we shall use the following hypotheses.

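(H.1) The sequence $(\varepsilon_{2k}, \eta_{2k}, \varepsilon_{2k+1}, \eta_{2k+1})_{k \geq 1}$ is independent and identically distributed. It is
also independent from $X_1$.

(H.2) The random variables $\varepsilon_2$, $\eta_2$, $\varepsilon_3$, $\eta_3$ and $X_1$ have moments of all order up to $4\gamma$, for some
$\gamma \geq 1$. The following hypotheses will be used

$$\mathbb{E}[\varepsilon_2] = \mathbb{E}[\varepsilon_3] = 0, \quad \mathbb{E}[\varepsilon_2^2] = \mathbb{E}[\varepsilon_3^2] = \sigma_\varepsilon^2 > 0 \quad \text{and} \quad \mathbb{E}[\varepsilon_2 \varepsilon_3] = \rho_\varepsilon,$$
$$\mathbb{E}[\eta_2] = \mathbb{E}[\eta_3] = 0, \quad \mathbb{E}[\eta_2^2] = \mathbb{E}[\eta_3^2] = \sigma_\eta^2 \geq 0 \quad \text{and} \quad \mathbb{E}[\eta_2 \eta_3] = \rho_\eta,$$
$$\mathbb{E}[\varepsilon_{2+i}\, \eta_{2+j}] = \rho_{ij}, \ \text{for } (i,j) \in \{0,1\}^2, \quad \text{and} \quad \rho = \tfrac{1}{2}(\rho_{01} + \rho_{10}).$$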
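To make the recursion (2.1) and the cell labelling concrete, here is a minimal simulation sketch on a complete binary tree. The function names `simulate_rbar` and `generation`, and the choice of independent centred Gaussian noise, are illustrative assumptions only: the model itself allows correlated and non-Gaussian noise.

```python
import random


def simulate_rbar(n_gen, a, b, c, d, sd_eps, sd_eta, x1=1.0, seed=0):
    """Simulate X_k for k = 1 .. 2**(n_gen + 1) - 1 on the complete binary tree.

    Illustrative assumption: the four noise terms are independent centred
    Gaussians. Index 0 of the returned list is unused so that X[k] is cell k.
    """
    rng = random.Random(seed)
    X = [0.0] * 2 ** (n_gen + 1)
    X[1] = x1
    for k in range(1, 2 ** n_gen):  # every cell of generations 0 .. n_gen - 1
        eps2, eps3 = rng.gauss(0.0, sd_eps), rng.gauss(0.0, sd_eps)
        eta2, eta3 = rng.gauss(0.0, sd_eta), rng.gauss(0.0, sd_eta)
        X[2 * k] = (b + eta2) * X[k] + (a + eps2)      # first daughter
        X[2 * k + 1] = (d + eta3) * X[k] + (c + eps3)  # second daughter
    return X


def generation(n):
    """Labels of the n-th generation: {2^n, ..., 2^(n+1) - 1}."""
    return range(2 ** n, 2 ** (n + 1))
```

With both standard deviations set to zero the sketch reduces to the deterministic asymmetric BAR recursion, which is a convenient sanity check.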

When dealing with the biological issue of cell lineages, it may happen that a lineage is incomplete.
Indeed, cells may die or measurements may be impossible or faulty on some cells. Taking into
account such a phenomenon, we introduce the observation process $(\delta_k)_{k \in \mathbb{T}}$. We use the same
framework as in and not the more general one introduced in [5]. Basically, $\delta_k = 1$ if cell $k$ is
observed, $\delta_k = 0$ otherwise. We set $\delta_1 = 1$ and define the whole sequence through the following
equalities:

$$\delta_{2k} = \delta_k \zeta_k^0 \quad \text{and} \quad \delta_{2k+1} = \delta_k \zeta_k^1, \qquad (2.2)$$

where the sequence $(\zeta_k = (\zeta_k^0, \zeta_k^1))_{k \in \mathbb{T}}$ is a sequence of independent identically distributed random
vectors of $\{0,1\}^2$ whose common distribution is specified by the following generating function

$$\mathbb{E}\big[s_0^{\zeta_1^0} s_1^{\zeta_1^1}\big] = (1 - p_0 - p_1 - p_{01}) + p_0 s_0 + p_1 s_1 + p_{01} s_0 s_1.$$

We also suppose that the observation process is independent from the state process $(X_k)$.

(H.3) The sequence $(\zeta_k)_{k \in \mathbb{T}}$ is independent from $(\varepsilon_{2k}, \eta_{2k}, \varepsilon_{2k+1}, \eta_{2k+1})_{k \in \mathbb{T}}$ and from $X_1$.

Notice that the process $(\delta_k)_{k \in \mathbb{T}}$ takes its values in $\{0,1\}$, and that if $k \in \mathbb{T}$ is such that $\delta_k = 0$, then
$\delta_{2^n k + i} = 0$, for all $i \in \{0, \ldots, 2^n - 1\}$ and all $n \geq 1$. So to speak, if individual $k$ is not observed,
all its descendants are also missing. We now define the sets of observed data

$$\mathbb{G}_n^* = \{k \in \mathbb{G}_n : \delta_k = 1\} \quad \text{and} \quad \mathbb{T}_n^* = \{k \in \mathbb{T}_n : \delta_k = 1\} = \bigcup_{\ell=0}^{n} \mathbb{G}_\ell^*.$$

Thanks to the i.i.d. property of the sequence $(\zeta_k)$, the sequence of cardinalities $(|\mathbb{G}_n^*|)_{n \geq 0}$ is a
Galton-Watson (GW) process with reproduction generating function

$$z \mapsto (1 - p_0 - p_1 - p_{01}) + (p_0 + p_1) z + p_{01} z^2,$$

and mean $m = 2p_{01} + p_0 + p_1$. The following equalities thus hold (see e.g. [17])

$$\mathbb{E}[|\mathbb{G}_n^*|] = m^n \quad \text{and} \quad \mathbb{E}[|\mathbb{T}_n^*|] = \sum_{\ell=0}^{n} \mathbb{E}[|\mathbb{G}_\ell^*|] = \frac{m^{n+1} - 1}{m - 1}.$$

According to the position of the mean $m$ of the reproduction law with respect to 1, it is well known
that the population becomes extinct or not. More precisely, if $m \leq 1$ then we have extinction almost
surely, in the sense that $\mathbb{P}(\bigcup_{n \geq 0}\{|\mathbb{G}_n^*| = 0\}) = 1$. But if $m > 1$, there is a positive probability of
survival of the population: $\mathbb{P}(\bigcap_{n \geq 0}\{|\mathbb{G}_n^*| > 0\}) > 0$. This latter case is called the super-critical
case, and we assume that we are in that case.






(H.4) The mean of the reproduction law is greater than 1: m > 1. 
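The observation mechanism can be sketched in a few lines. The snippet below draws, for each cell, which of its two daughters are observed with probabilities $p_{01}$ (both), $p_0$ (first only), $p_1$ (second only), and propagates missingness down the tree. The function names are hypothetical and the code is only a sketch of the Galton-Watson observation scheme described above, not the authors' implementation.

```python
import random


def simulate_observation(n_gen, p0, p1, p01, seed=0):
    """Return delta[k] in {0, 1} for k = 1 .. 2**(n_gen + 1) - 1 (index 0 unused)."""
    rng = random.Random(seed)
    delta = [0] * 2 ** (n_gen + 1)
    delta[1] = 1  # the original ancestor is always observed
    for k in range(1, 2 ** n_gen):
        u = rng.random()
        if u < p01:
            zeta = (1, 1)      # both daughters observed
        elif u < p01 + p0:
            zeta = (1, 0)      # only daughter 2k observed
        elif u < p01 + p0 + p1:
            zeta = (0, 1)      # only daughter 2k + 1 observed
        else:
            zeta = (0, 0)      # no daughter observed
        # A missing mother (delta[k] == 0) forces missing daughters.
        delta[2 * k] = delta[k] * zeta[0]
        delta[2 * k + 1] = delta[k] * zeta[1]
    return delta


def offspring_mean(p0, p1, p01):
    """Mean m of the reproduction law; super-criticality means m > 1."""
    return 2 * p01 + p0 + p1


def observed_generation_sizes(delta, n_gen):
    """Cardinalities of the observed generations, for n = 0 .. n_gen."""
    return [sum(delta[2 ** n: 2 ** (n + 1)]) for n in range(n_gen + 1)]
```

Averaging `observed_generation_sizes` over many seeds should approach $m^n$ in generation $n$, which gives a quick empirical check of the formula for the expected generation size.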



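On the set of non-extinction, the growth of the population is exponential, more precisely there
exists some non-negative square-integrable random variable $W$ such that

$$\lim_{n \to \infty} \frac{|\mathbb{G}_n^*|}{m^n} = W \ \text{a.s.}, \quad \text{and} \quad \{W > 0\} = \bigcap_{n \geq 0} \{|\mathbb{G}_n^*| > 0\} \ \text{a.s.} \qquad (2.3)$$

This immediately entails that

$$\lim_{n \to \infty} \frac{|\mathbb{T}_n^*|}{m^n} = W \times \frac{m}{m-1} \ \text{a.s.} \qquad (2.4)$$

We will denote by $\mathcal{E}$ the extinction set $\mathcal{E} = \bigcup_{n \geq 0}\{|\mathbb{G}_n^*| = 0\}$ and $\overline{\mathcal{E}}$ its complementary set. Note that
under assumption (H.4), $\overline{\mathcal{E}}$ has a positive probability: $\mathbb{P}(\overline{\mathcal{E}}) > 0$. We need one more assumption
combining the R-BAR and GW processes.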



(H.5) There exists $1 \leq \kappa \leq \gamma$ such that

$$\frac{p_{01} + p_0}{m}\, \mathbb{E}\big[|b + \eta_2|^{4\kappa}\big] + \frac{p_{01} + p_1}{m}\, \mathbb{E}\big[|d + \eta_3|^{4\kappa}\big] < 1.$$

This is the analogue of the usual stability assumption for the autoregression expressed by $\max\{|b|, |d|\} < 1$
in the case of deterministic coefficients. It ensures that the values of $|X_k|$ do not tend to infinity.
Note that the assumption above is slightly weaker than the one for deterministic coefficients.
Indeed, in the fully observed case and when $\eta_2 = \eta_3 = 0$, this equation reduces to $(b^{4\kappa} + d^{4\kappa})/2 < 1$.
The special form of this assumption is explained in Section 4.2 and is closely linked to the properties
of the R-BAR process as a bifurcating Markov chain.
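A small numerical sketch may help to see why this condition is weaker than $\max\{|b|, |d|\} < 1$. It assumes that (H.5) amounts to $\mathbb{E}[|B_1|^{4\kappa}] < 1$, where $B_1$ equals $b + \eta_2$ with probability $(p_{01} + p_0)/m$ and $d + \eta_3$ with probability $(p_{01} + p_1)/m$, as used in the proof of Lemma 4.3. The function name `b1_moment` and the Gaussian choice for the random part of the coefficients are illustrative assumptions.

```python
import random


def b1_moment(b, d, p0, p1, p01, s, eta_sd=0.0, n_samples=1, seed=0):
    """Monte Carlo estimate of E|B_1|^s for the mixture described in the lead-in.

    Assumption: eta_2 and eta_3 are centred Gaussians with standard deviation
    eta_sd. With eta_sd = 0 the value is exact even with a single sample.
    """
    rng = random.Random(seed)
    m = 2 * p01 + p0 + p1
    w0 = (p01 + p0) / m  # weight of the first-daughter coefficient
    w1 = (p01 + p1) / m  # weight of the second-daughter coefficient
    acc0 = acc1 = 0.0
    for _ in range(n_samples):
        acc0 += abs(b + rng.gauss(0.0, eta_sd)) ** s
        acc1 += abs(d + rng.gauss(0.0, eta_sd)) ** s
    return w0 * acc0 / n_samples + w1 * acc1 / n_samples
```

For instance, in the fully observed deterministic case with $b = 1.1$, $d = 0.5$ and $\kappa = 1$, the call `b1_moment(1.1, 0.5, 0.0, 0.0, 1.0, 4)` evaluates $(1.1^4 + 0.5^4)/2$, which is below 1 although $|b| > 1$.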

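Finally, denote by $\mathbb{F} = (\mathcal{F}_n)$ the natural filtration of the R-BAR process $(X_k)_{k \in \mathbb{T}}$, which means
that $\mathcal{F}_n$ is the $\sigma$-algebra generated by all individuals up to the $n$-th generation, $\mathcal{F}_n = \sigma\{X_k, k \in
\mathbb{T}_n\}$. We also introduce the sigma field $\mathcal{O} = \sigma\{\delta_k, k \in \mathbb{T}\}$ generated by the observation process.
We shall assume that all the history of the observation process $(\delta_k)$ is known at time 0 and use the
filtration $\mathbb{F}^{\mathcal{O}} = (\mathcal{F}_n^{\mathcal{O}})$ defined for all $n$ by

$$\mathcal{F}_n^{\mathcal{O}} = \mathcal{O} \vee \sigma\{\delta_k X_k, k \in \mathbb{T}_n\} = \mathcal{O} \vee \sigma\{X_k, k \in \mathbb{T}_n^*\}.$$

Note that $\mathcal{F}_n^{\mathcal{O}}$ is a sub-$\sigma$-field of $\mathcal{O} \vee \mathcal{F}_n$.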



3 Estimation 

We now give some least-squares estimators of our parameters and state our main results on their 
asymptotic behavior. 

3.1 Estimators 

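We propose to use the standard least-squares (LS) estimator $\hat\theta_n = (\hat a_n, \hat b_n, \hat c_n, \hat d_n)^t$ of $\theta = (a, b, c, d)^t$
which minimizes the following expression

$$\Delta_n(\theta) = \sum_{k \in \mathbb{T}_{n-1}} \delta_{2k}(X_{2k} - a - bX_k)^2 + \delta_{2k+1}(X_{2k+1} - c - dX_k)^2.$$

Consequently, for all $n \geq 1$ the following equality holds

$$\hat\theta_n = S_{n-1}^{-1} \sum_{k \in \mathbb{T}_{n-1}} \begin{pmatrix} \delta_{2k} X_{2k} \\ \delta_{2k} X_k X_{2k} \\ \delta_{2k+1} X_{2k+1} \\ \delta_{2k+1} X_k X_{2k+1} \end{pmatrix},$$

with

$$S_{n-1} = \begin{pmatrix} S_{n-1}^0 & 0 \\ 0 & S_{n-1}^1 \end{pmatrix}$$

and

$$S_{n-1}^0 = \sum_{k \in \mathbb{T}_{n-1}} \delta_{2k} \begin{pmatrix} 1 & X_k \\ X_k & X_k^2 \end{pmatrix}, \qquad S_{n-1}^1 = \sum_{k \in \mathbb{T}_{n-1}} \delta_{2k+1} \begin{pmatrix} 1 & X_k \\ X_k & X_k^2 \end{pmatrix}.$$

We now turn to the estimation of the parameters of the conditional covariance of $(\varepsilon_2, \eta_2, \varepsilon_3, \eta_3)$.
Following [21], we obtain a modified least squares estimator of $\sigma = (\sigma_\varepsilon^2, \rho_{00}, \rho_{11}, \sigma_\eta^2)^t$ by minimizing

$$\Delta'_n(\sigma) = \frac{1}{2} \sum_{\ell=1}^{n-1} \sum_{k \in \mathbb{G}_\ell} \Big( \hat\epsilon_{2k}^2 - \mathbb{E}[\epsilon_{2k}^2 \mid \mathcal{F}_\ell^{\mathcal{O}}] \Big)^2 + \Big( \hat\epsilon_{2k+1}^2 - \mathbb{E}[\epsilon_{2k+1}^2 \mid \mathcal{F}_\ell^{\mathcal{O}}] \Big)^2,$$

where for all $k \in \mathbb{G}_n$,

$$\epsilon_{2k} = \delta_{2k}(\varepsilon_{2k} + \eta_{2k} X_k), \qquad \epsilon_{2k+1} = \delta_{2k+1}(\varepsilon_{2k+1} + \eta_{2k+1} X_k),$$

$$\hat\epsilon_{2k} = \delta_{2k}(X_{2k} - \hat a_n - \hat b_n X_k), \qquad \hat\epsilon_{2k+1} = \delta_{2k+1}(X_{2k+1} - \hat c_n - \hat d_n X_k).$$

Under assumptions (H.2) and (H.3), one obtains the following estimator

$$\hat\sigma_n = U_{n-1}^{-1} \sum_{k \in \mathbb{T}_{n-1}} \Big( \hat\epsilon_{2k}^2 + \hat\epsilon_{2k+1}^2,\ 2X_k \hat\epsilon_{2k}^2,\ 2X_k \hat\epsilon_{2k+1}^2,\ X_k^2 (\hat\epsilon_{2k}^2 + \hat\epsilon_{2k+1}^2) \Big)^t, \qquad (3.1)$$

where

$$U_n = \sum_{k \in \mathbb{T}_n} \begin{pmatrix}
\delta_{2k} + \delta_{2k+1} & 2\delta_{2k} X_k & 2\delta_{2k+1} X_k & (\delta_{2k} + \delta_{2k+1}) X_k^2 \\
2\delta_{2k} X_k & 4\delta_{2k} X_k^2 & 0 & 2\delta_{2k} X_k^3 \\
2\delta_{2k+1} X_k & 0 & 4\delta_{2k+1} X_k^2 & 2\delta_{2k+1} X_k^3 \\
(\delta_{2k} + \delta_{2k+1}) X_k^2 & 2\delta_{2k} X_k^3 & 2\delta_{2k+1} X_k^3 & (\delta_{2k} + \delta_{2k+1}) X_k^4
\end{pmatrix}.$$

Note that if $\sigma_\eta^2 = 0$ the estimator of $\sigma_\varepsilon^2$ above corresponds to the empirical estimator already used
in [5]. Similarly, the least-squares estimator of $\rho = (\rho_\varepsilon, \rho, \rho_\eta)^t$ minimizes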



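$$\Delta''_n(\rho) = \frac{1}{2} \sum_{\ell=1}^{n-1} \sum_{k \in \mathbb{G}_\ell} \Big( \hat\epsilon_{2k} \hat\epsilon_{2k+1} - \mathbb{E}[\epsilon_{2k} \epsilon_{2k+1} \mid \mathcal{F}_\ell^{\mathcal{O}}] \Big)^2,$$

and one obtains

$$\hat\rho_n = V_{n-1}^{-1} \sum_{k \in \mathbb{T}_{n-1}} \Big( \hat\epsilon_{2k} \hat\epsilon_{2k+1},\ 2X_k \hat\epsilon_{2k} \hat\epsilon_{2k+1},\ X_k^2 \hat\epsilon_{2k} \hat\epsilon_{2k+1} \Big)^t, \qquad (3.2)$$

where

$$V_n = \sum_{k \in \mathbb{T}_n} \delta_{2k} \delta_{2k+1} \begin{pmatrix}
1 & 2X_k & X_k^2 \\
2X_k & 4X_k^2 & 2X_k^3 \\
X_k^2 & 2X_k^3 & X_k^4
\end{pmatrix}.$$

Note that one cannot identify $\rho_{01}$ from $\rho_{10}$, hence the use of $\rho = (\rho_{01} + \rho_{10})/2$. Again if $\sigma_\eta^2 = 0$,
we retrieve the empirical estimator of $\rho_\varepsilon$ used in [9].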
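As an illustration of the first estimator of this section, the sketch below computes the least-squares estimate of $(a, b, c, d)$ from observed data by solving the two $2 \times 2$ normal systems, one per daughter type. The name `ls_theta` and the data layout (lists indexed by cell label, index 0 unused) are assumptions made for the example. It presumes that each daughter type has at least two observed mothers with distinct values, so that both systems are invertible.

```python
def ls_theta(X, delta, n):
    """Least-squares estimate (a, b, c, d) using the mother cells k in T_{n-1}.

    X[k] is the value of cell k and delta[k] its observation indicator.
    For daughter type i (0: cell 2k, 1: cell 2k + 1) we regress X[2k + i]
    on (1, X[k]) over the observed pairs and solve the normal equations.
    """
    estimate = []
    for i in (0, 1):
        s0 = s1 = s2 = t0 = t1 = 0.0
        for k in range(1, 2 ** n):          # T_{n-1} = {1, ..., 2^n - 1}
            if delta[2 * k + i]:            # daughter observed (so is the mother)
                x, y = X[k], X[2 * k + i]
                s0 += 1.0
                s1 += x
                s2 += x * x
                t0 += y
                t1 += x * y
        det = s0 * s2 - s1 * s1             # determinant of the 2x2 block
        intercept = (s2 * t0 - s1 * t1) / det
        slope = (s0 * t1 - s1 * t0) / det
        estimate += [intercept, slope]
    return tuple(estimate)
```

On noise-free data generated by $X_{2k} = a + bX_k$ and $X_{2k+1} = c + dX_k$, the function returns the true parameters up to rounding, which is a simple way to validate an implementation before applying it to noisy or partially observed trees.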

3.2 Main results 

We now state our main results. The first one establishes the consistency of our estimators on the 
non-extinction set. 

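Theorem 3.1 Under assumptions (H.1-5), and if $\kappa \geq 2$, the following convergence holds

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}_n^*| > 0\}}\, \hat\theta_n = \theta\, \mathbf{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

and if in addition $\kappa \geq 4$ then the following convergences also hold

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}_n^*| > 0\}}\, \hat\sigma_n = \sigma\, \mathbf{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}, \qquad \lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}_n^*| > 0\}}\, \hat\rho_n = \rho\, \mathbf{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

The next results give convergence rates for the estimators.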

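Theorem 3.2 Under assumptions (H.1-5) and if $\kappa \geq 4$, for all $\delta > 1/2$, the following convergence
rate holds

$$\|\hat\theta_n - \theta\|^2 = o(n^{\delta} m^{-n}) \quad \text{a.s.}$$

with the quadratic strong law

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}_n^*| > 0\}}\, \frac{1}{n} \sum_{\ell=1}^{n} |\mathbb{T}_{\ell-1}^*|\, (\hat\theta_\ell - \theta)^t S \Sigma^{-1} S (\hat\theta_\ell - \theta) = \mathrm{tr}(\Gamma \Sigma^{-1})\, \mathbf{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

where $S$, $\Gamma$ and $\Sigma$ are $4 \times 4$ matrices defined respectively in Proposition 4.14, Lemma 5.4 and
Lemma 5.5.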



For all n, set 



£ 2k + £ 2fe+l \ 



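Theorem 3.3 Under assumptions (H.1-5) and if $\kappa \geq 8$, the following convergences hold

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}_n^*| > 0\}}\, \hat\sigma_n = \sigma\, \mathbf{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}_n^*| > 0\}}\, \frac{|\mathbb{T}_{n-1}^*|}{n} (\hat\sigma_n - \sigma_n) = U^{-1} \big( q_0(0) + q_1(0),\ 2q_0(1),\ 2q_1(1),\ q_0(2) + q_1(2) \big)^t\, \mathbf{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$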

where U is a 4 x 4 matrix defined in Proposition \4-14\ and the qi(r) are scalars defined in Lem- 
mas EZ3 [553 EH anrfE23 

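Theorem 3.4 Under assumptions (H.1-5) and if $\kappa \geq 8$, the following convergences hold

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}_n^*| > 0\}}\, \hat\rho_n = \rho\, \mathbf{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}_n^*| > 0\}}\, \frac{|\mathbb{T}_{n-1}^*|}{n} (\hat\rho_n - \rho_n) = V^{-1} \big( q_{01}(0),\ 2q_{01}(1),\ q_{01}(2) \big)^t\, \mathbf{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$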

where V is a 3 x 3 matrix defined in Proposition \4-lJ\ and the goi(r) are scalars defined in Lem- 
mas WmWm and\5lM 

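We now turn to the asymptotic normality for all our estimators $\hat\theta_n$, $\hat\sigma_n$ and $\hat\rho_n$ given the
non-extinction of the underlying Galton-Watson process. For this, using the fact that $\mathbb{P}(\overline{\mathcal{E}}) \neq 0$ thanks
to the super-criticality assumption (H.4), we define the probability $\mathbb{P}_{\overline{\mathcal{E}}}$ on $(\Omega, \mathcal{A})$ by

$$\mathbb{P}_{\overline{\mathcal{E}}}(A) = \frac{\mathbb{P}(A \cap \overline{\mathcal{E}})}{\mathbb{P}(\overline{\mathcal{E}})} \quad \text{for all } A \in \mathcal{A}.$$

Theorem 3.5 Under assumptions (H.1-5) and if $\kappa \geq 4$, the following central limit theorem holds

$$|\mathbb{T}_{n-1}^*|^{1/2} (\hat\theta_n - \theta) \xrightarrow{\mathcal{L}} \mathcal{N}(0, S^{-1} \Gamma S^{-1}) \quad \text{on } (\overline{\mathcal{E}}, \mathbb{P}_{\overline{\mathcal{E}}}) \qquad (3.3)$$

with $S$ defined in Proposition 4.14 and $\Gamma$ in Lemma 5.4. If moreover $\kappa \geq 8$,

$$|\mathbb{T}_{n-1}^*|^{1/2} (\hat\sigma_n - \sigma) \xrightarrow{\mathcal{L}} \mathcal{N}(0, U^{-1} \Gamma_\sigma U^{-1}) \quad \text{on } (\overline{\mathcal{E}}, \mathbb{P}_{\overline{\mathcal{E}}}), \qquad (3.4)$$

$$|\mathbb{T}_{n-1}^*|^{1/2} (\hat\rho_n - \rho) \xrightarrow{\mathcal{L}} \mathcal{N}(0, V^{-1} \Gamma_\rho V^{-1}) \quad \text{on } (\overline{\mathcal{E}}, \mathbb{P}_{\overline{\mathcal{E}}}), \qquad (3.5)$$

with $U$ and $V$ defined in Proposition 4.14 and $\Gamma_\sigma$ and $\Gamma_\rho$ defined in Section 6 (see Eq. (6.1)).
The proofs of these theorems are detailed in the next sections.

4 Bifurcating Markov chains and consistency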



In order to investigate the convergence of our estimators, we need laws of large numbers for quan- 
tities such as (52fc+i^fe^2fe^2fe+i)feeT- To obtain them, we use the bifurcating Markov chain 
framework introduced by J. Guyon in [13] and adapted to Galton-Watson trees by J.-F. Delmas
and L. Marsalle in [11]. We first recall the general framework, then prove the ergodicity of the
induced Markov chain and finally derive strong laws of large numbers. We conclude this section by 
establishing the strong consistency of our estimators. Note that we cannot directly use the results 
in because our noise sequences do not have moments of all order. Therefore, our first step is 
to provide a general result for bifurcating Markov chains on GW trees with only a finite number 
of moments. 

4.1 Bifurcating Markov chain 

Let $\mathcal{B}$ be the Borel $\sigma$-field of $\mathbb{R}$, and $\mathcal{B}^p$ be the Borel $\sigma$-field of $\mathbb{R}^p$, for $p \geq 1$. We add a cemetery
point $\partial$ to $\mathbb{R}$, denote by $\overline{\mathbb{R}}$ the set $\mathbb{R} \cup \{\partial\}$, and by $\overline{\mathcal{B}}$ the $\sigma$-field generated by $\mathcal{B}$ and $\{\partial\}$. This
cemetery point models the state of a non-observed cell. We recall the following definitions.

Definition 4.1 We call $\mathbb{T}^*$-transition probability any mapping $P$ from $\overline{\mathbb{R}} \times \overline{\mathcal{B}}^2$ onto $[0, 1]$ such that

• $P(\cdot, A)$ is measurable for all $A$ in $\overline{\mathcal{B}}^2$,

• $P(x, \cdot)$ is a probability measure on $(\overline{\mathbb{R}}^2, \overline{\mathcal{B}}^2)$ for all $x$ in $\overline{\mathbb{R}}$,

• $P(\partial, \{(\partial, \partial)\}) = 1$.

For any measurable function $f$ from $\overline{\mathbb{R}}^3$ onto $\mathbb{R}$, one defines the measurable function $Pf$ from $\overline{\mathbb{R}}$
onto $\mathbb{R}$ by

$$Pf(x) = \int f(x, y, z)\, P(x, dy, dz),$$

provided the integral is well defined. Let $\nu$ be a probability measure on $\mathbb{R}$. In the sequel, $\nu$ will
denote the distribution of $X_1$.

Definition 4.2 We say that $(Z_n)_{n \in \mathbb{T}}$ is a bifurcating Markov chain with initial distribution $\nu$ and
$\mathbb{T}^*$-transition probability $P$, a $P$-BMC in short, if

• $Z_1$ has distribution $\nu$,

• for all $n$ in $\mathbb{N}$, and for all family of measurable bounded functions $(f_k)_{k \in \mathbb{G}_n}$ on $\overline{\mathbb{R}}^3$,

$$\mathbb{E}\Big[ \prod_{k \in \mathbb{G}_n} f_k(Z_k, Z_{2k}, Z_{2k+1}) \,\Big|\, \sigma(Z_j, j \in \mathbb{T}_n) \Big] = \prod_{k \in \mathbb{G}_n} Pf_k(Z_k).$$

As explained in [13], this means that given the first $n$ generations $\mathbb{T}_n$, one builds generation $\mathbb{G}_{n+1}$ by
drawing $2^n$ independent couples $(Z_{2k}, Z_{2k+1})$ according to $P(Z_k, \cdot)$, $k \in \mathbb{G}_n$. This also means that
any couple $(Z_{2k}, Z_{2k+1})$ depends on past generations only through its mother $Z_k$. The assumption
$P(\partial, \{(\partial, \partial)\}) = 1$ means that $\partial$ is an absorbing state, and this hypothesis corresponds to the fact
that a cell that is not observed cannot give birth to an observed one. We also assume that $P(x, \mathbb{R}^2)$,
$P(x, \mathbb{R} \times \{\partial\})$ and $P(x, \{\partial\} \times \mathbb{R})$ do not depend on $x \in \mathbb{R}$. The $P$-BMC is thus said to be spatially
homogeneous. Such a spatially homogeneous $P$-BMC with an absorbing cemetery state is called a
bifurcating Markov chain on a Galton-Watson tree, see [11] for details.

Now let us turn back to our observed R-BAR process. In order to use the framework of
$P$-BMC's, we define the auxiliary process $(X_n^*)_{n \in \mathbb{T}}$ by

$$X_n^* = X_n \mathbf{1}_{\{\delta_n = 1\}} + \partial\, \mathbf{1}_{\{\delta_n = 0\}}, \qquad (4.1)$$

which means that $X_n^* = X_n$ if cell $n$ is observed, $X_n^* = \partial$ the cemetery state otherwise. It is clear
from assumptions (H.1) and (H.3) that the process $(X_n^*)_{n \in \mathbb{T}}$ is a $P$-BMC on a GW tree with
$\mathbb{T}^*$-transition probability given for all $x$ in $\overline{\mathbb{R}}$ and all measurable non-negative functions $f$ on $\overline{\mathbb{R}}^3$
by

$$Pf(x) = p_{01}\, \mathbb{E}\big[f(x, (b + \eta_2)x + a + \varepsilon_2, (d + \eta_3)x + c + \varepsilon_3)\big] + p_0\, \mathbb{E}\big[f(x, (b + \eta_2)x + a + \varepsilon_2, \partial)\big]$$
$$\qquad + p_1\, \mathbb{E}\big[f(x, \partial, (d + \eta_3)x + c + \varepsilon_3)\big] + (1 - p_{01} - p_0 - p_1) f(x, \partial, \partial), \qquad (4.2)$$

if $x \neq \partial$ and $Pf(\partial) = f(\partial, \partial, \partial)$. As explained in [13], the asymptotic behavior of the $P$-BMC is
driven by that of the induced Markov chain $(Y_n)$ defined on $\mathbb{R}$ as follows.

• For all $n \geq 1$, define the sequence $(A_n, B_n)_{n \geq 1}$ to be i.i.d. random variables with the same
distribution as $(a_{2+\zeta} + \varepsilon_{2+\zeta}, b_{2+\zeta} + \eta_{2+\zeta})$, where $\zeta$ is a Bernoulli random variable with mean
$(p_{01} + p_1)/m$ independent from $(\varepsilon_2, \eta_2, \varepsilon_3, \eta_3)$.

• Then, set $Y_0 = X_1^* = X_1$ and define $Y_{n+1}$ recursively by

$$Y_{n+1} = A_{n+1} + B_{n+1} Y_n. \qquad (4.3)$$

The sequence $(Y_n)_{n \in \mathbb{N}}$ is clearly an $\mathbb{R}$-valued Markov chain with transition kernel given for all $x$ in
$\mathbb{R}$ and $A$ in $\mathcal{B}$ by

$$Q(x, A) = \frac{P_0(x, A) + P_1(x, A)}{m}, \qquad (4.4)$$

with $P_i(x, A) = (p_{01} + p_i)\, \mathbb{E}\big[\mathbf{1}_A((b_{2+i} + \eta_{2+i}) x + a_{2+i} + \varepsilon_{2+i})\big]$. Note that $P_0$ and $P_1$ are
sub-probability kernels on $(\mathbb{R}, \mathcal{B})$, whereas $Q$ is a proper probability kernel on $(\mathbb{R}, \mathcal{B})$.
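The induced chain is easy to simulate, which can help build intuition for its ergodic behavior. The sketch below draws at each step the daughter type $\zeta$ with probability $(p_{01} + p_1)/m$ for the second daughter and then applies the recursion (4.3). The function name and the independent Gaussian noise are assumptions made for illustration.

```python
import random


def simulate_induced_chain(n_steps, y0, a, b, c, d, p0, p1, p01,
                           sd_eps, sd_eta, seed=0):
    """Return the path [Y_0, ..., Y_n] of Y_{n+1} = A_{n+1} + B_{n+1} Y_n.

    Assumption: epsilon and eta are independent centred Gaussians.
    """
    rng = random.Random(seed)
    m = 2 * p01 + p0 + p1
    q1 = (p01 + p1) / m  # probability of following the second daughter
    path = [y0]
    for _ in range(n_steps):
        if rng.random() < q1:
            A = c + rng.gauss(0.0, sd_eps)
            B = d + rng.gauss(0.0, sd_eta)
        else:
            A = a + rng.gauss(0.0, sd_eps)
            B = b + rng.gauss(0.0, sd_eta)
        path.append(A + B * path[-1])
    return path
```

In the symmetric noise-free case ($a = c$, $b = d$, zero standard deviations) the path is the deterministic AR(1) orbit converging to $a/(1-b)$ when $|b| < 1$.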



4.2 Ergodicity of the induced Markov chain 

We now turn to the investigation of ergodicity for the induced Markov chain $(Y_n)_{n \in \mathbb{N}}$. We start
with some preliminary results on the random variables $A_1$ and $B_1$.

Lemma 4.3 Under assumptions (H.2) and (H.5), the random variables $A_1$ and $B_1$ have moments
of all order up to $4\gamma$. In addition, $\mathbb{E}[\log |B_1|] < 0$ and for all $0 < s \leq 4\kappa$, the inequality
$\mathbb{E}[|B_1|^s] < 1$ holds.

Proof First, for all $0 < s \leq 4\gamma$, the following equalities clearly hold

$$\mathbb{E}[|A_1|^s] = \frac{p_{01} + p_0}{m} \mathbb{E}[|a + \varepsilon_2|^s] + \frac{p_{01} + p_1}{m} \mathbb{E}[|c + \varepsilon_3|^s], \qquad \mathbb{E}[|B_1|^s] = \frac{p_{01} + p_0}{m} \mathbb{E}[|b + \eta_2|^s] + \frac{p_{01} + p_1}{m} \mathbb{E}[|d + \eta_3|^s].$$

Hence, under assumption (H.2), it is clear that $\mathbb{E}[|A_1|^s]$ and $\mathbb{E}[|B_1|^s]$ are finite. Next, notice that
the function $s \mapsto \mathbb{E}[|B_1|^s]$ is convex, that $\mathbb{E}[|B_1|^0] = 1$ and $\mathbb{E}[|B_1|^{4\kappa}] < 1$ by assumption (H.5).
This implies that $\mathbb{E}[|B_1|^s] < 1$ for all $0 < s \leq 4\kappa$. Last, consider $\mathbb{E}[|\log |B_1||]$: if it is finite,
$\mathbb{E}[\log |B_1|]$ is the right-derivative at 0 of $s \mapsto \mathbb{E}[|B_1|^s]$, and convexity arguments with
assumption (H.5) yield that $\mathbb{E}[\log |B_1|] < 0$; if it is infinite, the moment assumption on $B_1$ gives that
necessarily $\mathbb{E}[(\log |B_1|)^+] < \infty$ and $\mathbb{E}[(\log |B_1|)^-] = \infty$, so that finally $\mathbb{E}[\log |B_1|] = -\infty < 0$, as
expected. □

The next result states the existence of an invariant distribution for the Markov chain $(Y_n)_{n \in \mathbb{N}}$. It is
well known as $(Y_n)$ is a real-valued auto-regressive process with random i.i.d. coefficients satisfying
Lemma 4.3, see e.g. OH].

Lemma 4.4 Under assumptions (H.2) and (H.5), there exists a probability distribution $\mu$ on
$(\mathbb{R}, \mathcal{B})$ (which is the distribution of the convergent series $Y_\infty = \sum_{\ell=1}^{\infty} B_1 \cdots B_{\ell-1} A_\ell$) such that
for all continuous bounded functions $f$ on $\mathbb{R}$ and all $x$ in $\mathbb{R}$, the following convergence holds

$$\mathbb{E}_x[f(Y_n)] \xrightarrow[n \to \infty]{} \int f\, d\mu = (\mu, f).$$

We investigate the moments of the invariant distribution $\mu$ to extend the above result to polynomial
functions. For all $s \geq 1$, set $\|X\|_s = (\mathbb{E}[|X|^s])^{1/s}$.

Lemma 4.5 Under assumptions (H.2) and (H.5), $\mu$ has moments of all order up to $4\kappa$. In
addition, for all $1 \leq s \leq 4\kappa$, all $x \in \mathbb{R}$ and all $n \in \mathbb{N}$, $(\mathbb{E}_x[|Y_n|^s])^{1/s} \leq |x| + \|A_1\|_s / (1 - \|B_1\|_s) < \infty$.

Proof Set $1 \leq s \leq 4\kappa$. As the sequence $(A_n, B_n)$ is i.i.d., the following inequality holds

$$\mathbb{E}[|Y_\infty|^s]^{1/s} = \mathbb{E}\Big[\Big|\sum_{\ell=1}^{\infty} B_1 \cdots B_{\ell-1} A_\ell\Big|^s\Big]^{1/s} \leq \sum_{\ell=1}^{\infty} \|B_1\|_s^{\ell-1} \|A_1\|_s.$$

Since $\mathbb{E}[|B_1|^s] < 1$ and $\mathbb{E}[|A_1|^s] < \infty$ thanks to Lemma 4.3, the series converges. Now let us turn
to $Y_n$. The recursive equation (4.3) yields

$$Y_n = Y_0 B_1 \cdots B_n + \sum_{\ell=1}^{n} B_n \cdots B_{\ell+1} A_\ell,$$

with the usual convention that an empty product equals 1. As the sequence $(A_n, B_n)$ is i.i.d., $Y_n$
also has the same distribution (under $\mathbb{P}_x$) as

$$x B_1 \cdots B_n + \sum_{\ell=1}^{n} B_1 \cdots B_{\ell-1} A_\ell,$$

so that, for $1 \leq s \leq 4\kappa$, the following inequality holds

$$\mathbb{E}_x[|Y_n|^s]^{1/s} \leq |x| \|B_1\|_s^n + \sum_{\ell=1}^{n} \|B_1\|_s^{\ell-1} \|A_1\|_s \leq |x| + \frac{\|A_1\|_s}{1 - \|B_1\|_s}, \qquad (4.5)$$

hence the result. □



The next result is a direct consequence of Lemma 4.5.

Corollary 4.6 Under assumptions (H.2) and (H.5), all polynomial functions $f$ of degree less
than $4\kappa$ are in $L^1(\mu)$: $(\mu, |f|) = \mathbb{E}[|f(Y_\infty)|] < \infty$.
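The moment bound of Lemma 4.5 can be checked exactly on a toy case. The sketch below assumes noise-free coefficients, so that $(A_1, B_1)$ takes the value $(a, b)$ with probability $1 - q_1$ and $(c, d)$ with probability $q_1$. It then computes $(\mathbb{E}_x[|Y_n|^s])^{1/s}$ by enumerating all $2^n$ paths and compares it with $|x| + \|A_1\|_s / (1 - \|B_1\|_s)$. All function names are hypothetical, and the bound is only meaningful when $\|B_1\|_s < 1$.

```python
from itertools import product


def ls_norm(values, weights, s):
    """L^s norm of a random variable taking values[i] with probability weights[i]."""
    return sum(w * abs(v) ** s for v, w in zip(values, weights)) ** (1.0 / s)


def moment_bound(x, a, b, c, d, q1, s):
    """Right-hand side |x| + ||A_1||_s / (1 - ||B_1||_s); requires ||B_1||_s < 1."""
    w = (1.0 - q1, q1)
    return abs(x) + ls_norm((a, c), w, s) / (1.0 - ls_norm((b, d), w, s))


def exact_ls_norm_of_yn(x, n, a, b, c, d, q1, s):
    """Exact (E_x |Y_n|^s)^(1/s), enumerating the 2^n daughter-type sequences."""
    total = 0.0
    for types in product((0, 1), repeat=n):
        y, prob = x, 1.0
        for z in types:
            A, B = ((a, b), (c, d))[z]
            y = A + B * y                      # one step of the recursion
            prob *= q1 if z else 1.0 - q1
        total += prob * abs(y) ** s
    return total ** (1.0 / s)
```

The enumeration is exponential in $n$, so this is only suitable for small $n$; for larger horizons a Monte Carlo estimate would be the natural substitute.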

We state a technical domination result that will be useful in the next section. 

Lemma 4.7 Under assumptions (H.2) and (H.5), for all polynomial functions $f$ of degree less
than $2q$ with $q \leq 2\kappa$, there exists a nonnegative polynomial function $g$ of degree less than $2q$ such
that for all $n \in \mathbb{N}$ and all $x \in \mathbb{R}$,

$$|\mathbb{E}_x[f(Y_n)]| \leq g(x).$$

Proof It is sufficient to prove the result for $f(x) = x^p$ with $p \leq 2q$. For $p \geq 1$, Lemma 4.5 yields

$$|\mathbb{E}_x[Y_n^p]| \leq \mathbb{E}_x[|Y_n|^p] \leq \Big( |x| + \frac{\|A_1\|_p}{1 - \|B_1\|_p} \Big)^p \leq 2^{p-1} \Big( |x|^p + \frac{\|A_1\|_p^p}{(1 - \|B_1\|_p)^p} \Big).$$

If $p$ is even, we set $g(x) = 2^{p-1} \big( x^p + \|A_1\|_p^p / (1 - \|B_1\|_p)^p \big)$, and if $p$ is odd, we set $g(x) =
2^{p-1} \big( x^{p+1} + 1 + \|A_1\|_p^p / (1 - \|B_1\|_p)^p \big)$, as for all $x \in \mathbb{R}$, $|x|^p \leq x^{p+1} + 1$. Notice that if $p$ is odd
and $p \leq 2q$, one also has $p + 1 \leq 2q$, hence the result.

□
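The two elementary inequalities behind the dominating polynomial, namely $(u + v)^p \leq 2^{p-1}(u^p + v^p)$ for $u, v \geq 0$ and $|x|^p \leq x^{p+1} + 1$ for odd $p$, can be checked numerically. In the sketch below, `K` is a placeholder for the constant $\|A_1\|_p / (1 - \|B_1\|_p)$, assumed finite and nonnegative, and the function name is hypothetical.

```python
def dominating_polynomial(x, p, K):
    """Value g(x) of the dominating polynomial for f(x) = x^p.

    K stands for ||A_1||_p / (1 - ||B_1||_p). For even p the polynomial
    x^p is already nonnegative; for odd p it is replaced by x^(p+1) + 1.
    """
    if p % 2 == 0:
        return 2 ** (p - 1) * (x ** p + K ** p)
    return 2 ** (p - 1) * (x ** (p + 1) + 1 + K ** p)
```

Evaluating `dominating_polynomial(x, p, K)` against `(abs(x) + K) ** p` over a grid of `x` values gives a quick confirmation that the former dominates the latter.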



Finally, we prove the geometric ergodicity of the induced Markov chain for polynomial functions. 

Lemma 4.8 Under assumptions (H.1-2) and (H.5), for all polynomial functions $f$ of degree less
than $2q$ with $q \leq 2\kappa$, there exists a nonnegative polynomial function $g$ of degree less than $2q$ and a
positive constant $c$ such that for all $n \in \mathbb{N}$ and all $x \in \mathbb{R}$ the following inequalities hold

E x [f(Y n )}-(»,f) <«?(*) USxIl^, E„[/(K„)] - (fi,f) < C ]|Bi||2«. 
Proof Without loss of generality, it is sufficient to prove the result for polynomials $f$ of the form
$x^p$ with $1 \leq p \leq 2q$. Hölder's inequality yields



E x [/(r„)]-(/x,/) 



p-i 



s=0 



p-1 



s=0 



We are going to study the two terms above separately. For the first term, Eq. (|4.5p and the 
definition of yield 



1/p 



< M\Bi 

< (\x\ + 



E[\xB 1 ---B n - Y, B 1 ---B^ 1 A t \P] 

t=n+l 

Ai\\ P " 



i/p 



MiIIp 



l-\\Bx\\ P 



\Bi\\2 K , 



by Lemma 14.31 as p < 4k by assumption. We now turn to the second term. Holder's inequality 
with a = (p — l)/s and /3 = (p — l)/(p — 1 — s) yields 

s/(p-l) f n a) I 

EIIYO.H 



(^[ly^- 1 -!^]) < (E K [|y„n 

< (N + I l|Al||p 



- 1 oo Hp 



- Il-Billp 

this last upper bound coming from Lemma 14.51 Finally, one obtains 

I^iIIp 



ExK-y^: 



p-i 

< ll^illLE 1 



s=0 



l-ll^ll 



yooiir 1- ' < \\BAl„9{x\ 



where g is a polynomial function of degree at most 2q by a similar argument as in the previous 
proof. Integrating this upper bound with respect to the initial law v and using (H.l) gives the 
second result. □ 



4.3 Laws of large numbers for the P-BMC 

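We now want to prove laws of large numbers for a family of functionals of the $P$-BMC $(X_n^*)$. We
are interested in polynomial functions on $\overline{\mathbb{R}}$ and $\overline{\mathbb{R}}^3$ multiplied by indicators. Precisely, for all
$q \geq 1$, let $F_q$ and $G_q$ be the vector spaces generated by the following classes of functions from $\overline{\mathbb{R}}^3$ to
$\mathbb{R}$ and from $\overline{\mathbb{R}}$ to $\mathbb{R}$ respectively,

$$\{x^\alpha y^\beta \mathbf{1}_{\mathbb{R}}(y),\ x^\alpha z^\tau \mathbf{1}_{\mathbb{R}}(z),\ x^\alpha y^\beta z^\tau \mathbf{1}_{\mathbb{R}^2}(y, z),\ 0 \leq \alpha + \beta + \tau \leq q\},$$

$$\{x^\alpha \mathbf{1}_{\mathbb{R}}(x),\ 0 \leq \alpha \leq q\},$$

where $\alpha$, $\beta$, $\tau$ are integers. We first establish some technical results needed in the main proof.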

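Lemma 4.9 Let $f \in F_q$ and $h \in G_q$. Under assumption (H.2),

(i) if $q \leq 4\gamma$, $f \in L^1(P)$ and $Pf \in G_q$,

(ii) if $q \leq 4\gamma$, $h \in L^1(P_0, P_1, Q)$ and $P_0 h$, $P_1 h$ and $Qh \in G_q$,

(iii) if $q \leq 2\gamma$, $h \otimes h \in L^1(P)$ and $P(h \otimes h) \in G_{2q}$.

Proof Take $q \leq 4\gamma$ and $p \leq 2\gamma$ and remark that $Pf(\partial) = 0$ for any $f \in F_q$, so that $Pf(x) =
Pf(x) \mathbf{1}_{\mathbb{R}}(x)$ for all $x \in \overline{\mathbb{R}}$. Next, take $f_1(x, y, z) = x^\alpha y^\beta z^\tau \mathbf{1}_{\mathbb{R}^2}(y, z)$ and $f_2(x, y, z) = x^\delta y^\epsilon \mathbf{1}_{\mathbb{R}}(y)$ in $F_q$,
$h(x) = x^p \mathbf{1}_{\mathbb{R}}(x)$ in $G_q$, and $l(y, z) = y^i \otimes z^j \mathbf{1}_{\mathbb{R}^2}(y, z)$ with $i, j \leq p$. Eq. (4.2) yields, for $i \in \{0, 1\}$,

$$P|f_1|(x) = p_{01} |x|^\alpha\, \mathbb{E}\big[|(b + \eta_2)x + a + \varepsilon_2|^\beta\, |(d + \eta_3)x + c + \varepsilon_3|^\tau\big],$$

$$P|f_2|(x) = (p_{01} + p_0) |x|^\delta\, \mathbb{E}\big[|(b + \eta_2)x + a + \varepsilon_2|^\epsilon\big],$$

$$P_i|h|(x) = (p_{01} + p_i)\, \mathbb{E}\big[|(b_{2+i} + \eta_{2+i})x + a_{2+i} + \varepsilon_{2+i}|^p\big],$$

$$P|l|(x) = p_{01}\, \mathbb{E}\big[|(b + \eta_2)x + a + \varepsilon_2|^i\, |(d + \eta_3)x + c + \varepsilon_3|^j\big].$$

Assumption (H.2) entails that the $4\gamma$-th moments of $((b + \eta_2)x + a + \varepsilon_2)$ and $((d + \eta_3)x + c + \varepsilon_3)$
are finite, which gives the integrability results, since $\beta + \tau \leq q$, $\epsilon \leq q$, $p \leq q$ and $i + j \leq 2p$. The
integrability results are thus proved. It is then obvious that $Pl(x)$ is computed the same way as
$Pf_1(x)$, and $P_i h(x)$ the same way as $Pf_2(x)$. But

$$Pf_1(x) = p_{01} \sum_{r=0}^{\beta} \sum_{s=0}^{\tau} C_\beta^r C_\tau^s\, \mathbb{E}\big[(b + \eta_2)^r (a + \varepsilon_2)^{\beta - r} (d + \eta_3)^s (c + \varepsilon_3)^{\tau - s}\big]\, x^{\alpha + r + s},$$

$$Pf_2(x) = (p_{01} + p_0) \sum_{r=0}^{\epsilon} C_\epsilon^r\, \mathbb{E}\big[(b + \eta_2)^r (a + \varepsilon_2)^{\epsilon - r}\big]\, x^{\delta + r},$$

so that $Pf_1$, $Pf_2$ and $P_i h$ are in $G_q$, since $\alpha + \beta + \tau \leq q$, $\delta + \epsilon \leq q$ and $p \leq q$. As for $Pl$, it belongs
to $G_{2p}$, since $i + j \leq 2p$. □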

We are now ready to prove the main result of this section. 

Theorem 4.10 Under assumptions (H.1-5), for all functions $f \in F_\kappa$, the following law of large
numbers holds

$$\lim_{n \to \infty} \frac{1}{m^n} \sum_{k \in \mathbb{G}_n^*} f(X_k^*, X_{2k}^*, X_{2k+1}^*) = (\mu, Pf)\, W \quad \text{a.s.}$$

Proof This result is similar to Theorem 11 of [13] and Theorem 3.1 of [11]. The proof follows
essentially the same lines and is thus shortened here, the main difference being that the class of
functions $F_\kappa$ does not satisfy assumptions (i)-(vi) from [13, 11] mainly because $F_\kappa$ is not stable by
multiplication and $(\varepsilon_2, \eta_2, \varepsilon_3, \eta_3)$ do not have moments of all order.

For all / in F K , Pf is well-defined from R onto R thanks to Lemma 1431 as n < 7. As Pf(d) = 0, 
by a slight abuse of notation we will also denote Pf its restriction to R. Thus, Pf is /_t-integrable 
by Lemma 1431 One has 

m- n J2 f(x* k ,x; k ,x; k+1 ) - (j*,pf)w = ^ E (f{x* k ,x* 2k ,x* 2k+1 ) - ( M ,p/)) + (^Pf)( l - 



By Eq. (j2.3p the second term converges to a.s. as n tends to infinity. In order to prove the a.s. 
convergence of the first term, as in |13| 111], it is sufficient to prove that 

£m- 2 "E[( ]T g {XlX* 2k ,X* 2k+1 )f] < 00, (4.6) 

n>0 fceG* 

with g = f — (/i, Pf) G F K . Thanks to Lemma [4.91 Pg G G K , and as g 2 G P2 K , one also has 
Pg 2 G G 2k . The expectation inside the sum decomposes as 



e[( Y, g(xt,x* 2kl x* 2k+1 )) 2 ] = e[( J2 Pg(x* k )) 2 } +e[ ]T ( p v 2 (P9) 2 ) «)] = c n + d„ 

fceG* feeG* fceG* 

We study the two terms $C_n$ and $D_n$ separately. Let us first prove that $\sum_{n \geq 0} m^{-2n} D_n < \infty$. We
can rewrite $D_n = \mathbb{E}\big[\sum_{k \in \mathbb{G}_n^*} h(X_k^*)\big]$ with $h = Pg^2 - (Pg)^2$. As seen above, $h \in G_{2\kappa}$ and therefore
$h$ is $\mu$-integrable thanks to Lemma 4.5. To investigate the limit of $\sum_n m^{-2n} D_n$, we prove that
$m^{-n} D_n$ has a finite limit. More precisely, the following inequality holds

\\m- n h(X* k )-(»,h)W\\ 2 = \\m- n J2 (HX* k )-(n,h)) + ([,,h)(m- n \G* n \-W)\\ 2 
fceG* fceG* 

< ||m-" (h(X* k )-(n,h))\\ 2 + \(fx,h}\ ||m-"|G;|-W|| 2 . 
fceG* 

The second term converges to zero. For the first term, again let I := h— (/i, h) G G 2k and (fi, I) = 0, 
and by equation (15) p 2504], the following equality holds 

n-l 

\\m- n J2 (HX*k)-(^h))\\i = m- n E4l\Y n )} + 2?7i- 2 Y,™~Hv,Q e nQ n ~ e ~ 1 l®Q n - e ^) 

fceG* e=o 

Concerning the first term in Eq. (4.7), as $l^2 \in G_{4\kappa}$, Lemma 4.8 gives $\lim_{n\to\infty}\mathbb{E}_\nu[l^2(Y_n)] = (\mu, l^2)$,
so that $m^{-n}\mathbb{E}_\nu[l^2(Y_n)]$ converges to 0. Concerning the second term in Eq. (4.7), Lemma 4.5
yields $\lim_{n\to\infty} Q^{n-\ell-1}l(x) = \lim_{n\to\infty}\mathbb{E}_x[l(Y_{n-\ell-1})] = (\mu,l) = 0$ and, by Lemma 4.7, $Q^{n-\ell-1}l$ is
dominated by some $\phi \in G_{2\kappa}$. Moreover, using Lemma 4.3, $\phi\otimes\phi$ belongs to $F_{4\kappa}$, it is $P$-integrable
and $P(\phi\otimes\phi)$ belongs to $G_{4\kappa}$. By Lemma 4.7, $Q^{\ell}P(\phi\otimes\phi)$ is dominated by some $\psi \in G_{4\kappa}$, which
is $\nu$-integrable by assumption (H.2). Lebesgue's dominated convergence theorem yields

$$\lim_{n\to\infty}\big(\nu, Q^{\ell}P(Q^{n-\ell-1}l \otimes Q^{n-\ell-1}l)\big) = 0,$$

and $\big|\big(\nu, Q^{\ell}P(Q^{n-\ell-1}l \otimes Q^{n-\ell-1}l)\big)\big| \le (\nu,\psi)$. This upper bound allows us to deal with the limit of
the second term of Eq. (4.7). Under assumption (H.4), $\sum_{\ell\ge 0} m^{-\ell}$ converges and, for $\epsilon > 0$, there
exists $\ell_\epsilon$ such that $\sum_{\ell\ge \ell_\epsilon} m^{-\ell}(\nu,\psi) < \epsilon$. Finally, for $n > \ell_\epsilon$, we obtain

$$\Big|\sum_{\ell=0}^{n-1} m^{-\ell}\big(\nu, Q^{\ell}P(Q^{n-\ell-1}l \otimes Q^{n-\ell-1}l)\big)\Big| \le \sum_{\ell=0}^{\ell_\epsilon-1} m^{-\ell}\big|\big(\nu, Q^{\ell}P(Q^{n-\ell-1}l \otimes Q^{n-\ell-1}l)\big)\big| + \epsilon.$$

All the terms of the finite sum on the right-hand side converge to 0 as $n$ tends to infinity, which finally proves the $L^2$-convergence
of $m^{-n}\sum_{k\in\mathbb{G}_n^*} h(X_k)$ to $(\mu,h)W$. This implies the convergence of the expectation $m^{-n}D_n$ to
$(\mu,h)\mathbb{E}[W]$ (recall that $W$ is square-integrable). Therefore, one obtains $\sum_{n\ge 0} m^{-2n}D_n < \infty$ because
$m > 1$.

Let us now prove that $\sum_{n\ge 0} m^{-2n}C_n < \infty$. Recall that $g \in F_\kappa$ and $(\mu,Pg) = 0$; following
Eq. (15) p. 2504 of [11], we obtain a new expression for $C_n$:

$$m^{-2n}C_n = \Big\| m^{-n}\sum_{k\in\mathbb{G}_n^*} Pg(X_k)\Big\|_2^2 = m^{-n}\,\mathbb{E}_\nu[(Pg)^2(Y_n)] + 2m^{-2}\sum_{\ell=0}^{n-1} m^{-\ell}\big(\nu, Q^{\ell}P(Q^{n-\ell-1}(Pg)\otimes Q^{n-\ell-1}(Pg))\big). \qquad (4.8)$$

The proof of the convergence of the first term of Eq. (4.8) is the same as the one of $\mathbb{E}_\nu[l^2(Y_n)]$, and
$\sum_{n\ge 0} m^{-n}\mathbb{E}_\nu[(Pg)^2(Y_n)]$ converges. For the second term, setting $p = n-\ell-1$, we can rewrite

$$\sum_{n\ge 0}\sum_{\ell=0}^{n-1} m^{-\ell}\big(\nu, Q^{\ell}P(Q^{n-\ell-1}(Pg)\otimes Q^{n-\ell-1}(Pg))\big) = \sum_{\ell\ge 0} m^{-\ell}\Big(\nu, Q^{\ell}P\Big(\sum_{p\ge 0} Q^{p}(Pg)\otimes Q^{p}(Pg)\Big)\Big).$$

By LemmaEd there exists ip G G K+ i, such that |E a; [(Pg)(F p )]| = |QP(Pg)(x)| < (/j(a;)]|Pi||^ and 
therefore the following inequality holds 

i E w p ( p 5) ® Q p ( p -9))i ^ (v ® E H^iii.- 

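By assumption (H.5), the series converges and there only remains to study the asymptotic behavior
of $\sum_{\ell\ge 0} m^{-\ell}\big(\nu, Q^{\ell}P(\varphi\otimes\varphi)\big)$. For this, let us remark that $\big(\nu, Q^{\ell}P(\varphi\otimes\varphi)\big) = \mathbb{E}_\nu[P(\varphi\otimes\varphi)(Y_\ell)]$
with $P(\varphi\otimes\varphi) \in G_{2\kappa+2}$. By Lemma 4.8, $\lim_{\ell\to\infty}\mathbb{E}_\nu[P(\varphi\otimes\varphi)(Y_\ell)]$ is finite and the series converges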
because m > 1. We have thus proved that Eq. f|4.0[) holds, and hence the almost sure convergence 
of the series m"" EfeG* K X h^ to (/x, P/)VF. □ 
4.4 Laws of large numbers for the R-BAR process

Let us now turn back to our R-BAR process and see how the law of large numbers given by
Theorem 4.10 applies to our process.

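Proposition 4.11 Under assumptions (H.1-5), for all integers $0 \le q \le \kappa$, and all $i \in \{0,1\}$, the
following laws of large numbers hold

$$\lim_{n\to\infty} \mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\,\frac{1}{|\mathbb{T}_n^*|}\sum_{k\in\mathbb{T}_n^*}\delta_{2k+i}X_k^q = \ell_i(q)\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

$$\lim_{n\to\infty} \mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\,\frac{1}{|\mathbb{T}_n^*|}\sum_{k\in\mathbb{T}_n^*}\delta_{2k}\delta_{2k+1}X_k^q = \ell_{01}(q)\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

with $\ell_i(q) = (p_{01}+p_i)\,\mathbb{E}[Y_\infty^q]$ and $\ell_{01}(q) = p_{01}\,\mathbb{E}[Y_\infty^q]$.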

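Proof Let $q \le \kappa$. We apply Theorem 4.10 to the function $f_0(x,y,z) = x^q\mathbb{1}_{\mathbb{R}}(y)$ if $i = 0$ and
$f_1(x,y,z) = x^q\mathbb{1}_{\mathbb{R}}(z)$ if $i = 1$ for the first limit, and $f_{01}(x,y,z) = x^q\mathbb{1}_{\mathbb{R}^2}(y,z)$ for the second
limit. The functions $f_0$, $f_1$ and $f_{01}$ clearly belong to $F_\kappa$, and moreover $Pf_i(x) = (p_{01}+p_i)x^q$,
$Pf_{01}(x) = p_{01}x^q$. Finally, notice that $(\mu, x^q) = \mathbb{E}[Y_\infty^q]$. Theorem 4.10 thus yields

$$\lim_{n\to\infty}\frac{1}{m^n}\sum_{k\in\mathbb{G}_n^*}\delta_{2k+i}X_k^q = \ell_i(q)W, \qquad \lim_{n\to\infty}\frac{1}{m^n}\sum_{k\in\mathbb{G}_n^*}\delta_{2k}\delta_{2k+1}X_k^q = \ell_{01}(q)W \quad \text{a.s.}$$

Now, for instance, the following decomposition holds

$$\frac{1}{m^n}\sum_{k\in\mathbb{T}_n^*}\delta_{2k+i}X_k^q = \sum_{\ell=0}^{n} m^{\ell-n}\Big(\frac{1}{m^\ell}\sum_{k\in\mathbb{G}_\ell^*}\delta_{2k+i}X_k^q\Big).$$

The sum above converges to $\ell_i(q)W\,m/(m-1)$ thanks to Lemma A.3 of [3] and we conclude using
Eq. (2.4). □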
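
The last step of this proof rests on an elementary geometric-weight limit, which can be sketched as follows. The sequence $c_\ell$ is illustrative notation (it is not used in the original text) standing for $m^{-\ell}\sum_{k\in\mathbb{G}_\ell^*}\delta_{2k+i}X_k^q$; the display only records the special case that is needed here, presumably the role played by Lemma A.3 of [3].

```latex
% Assumptions (for illustration): c_\ell \to c almost surely, and m > 1.
\sum_{\ell=0}^{n} m^{\ell-n}\, c_\ell
  \;=\; \sum_{j=0}^{n} m^{-j}\, c_{n-j}
  \;\xrightarrow[n\to\infty]{}\;
  c \sum_{j\ge 0} m^{-j}
  \;=\; c\,\frac{m}{m-1}.
```

With $c = \ell_i(q)W$ this gives the limit $\ell_i(q)W\,m/(m-1)$ stated in the proof.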

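Proposition 4.12 Under assumptions (H.1-5), for all integers $0 \le q \le \kappa-1$, and all $i \in \{0,1\}$,
the following almost sure convergences hold

$$\lim_{n\to\infty}\frac{\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}}{|\mathbb{T}_{n-1}^*|}\sum_{k\in\mathbb{T}_{n-1}}\delta_{2k+i}X_k^qX_{2k+i} = (p_{01}+p_i)\big(a_{2+i}\mathbb{E}[Y_\infty^q] + b_{2+i}\mathbb{E}[Y_\infty^{q+1}]\big)\mathbb{1}_{\overline{\mathcal{E}}},$$

and if $\kappa \ge 2$, for all integers $0 \le q \le \kappa-2$, the following almost sure convergences hold

$$\lim_{n\to\infty}\frac{\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}}{|\mathbb{T}_{n-1}^*|}\sum_{k\in\mathbb{T}_{n-1}}\delta_{2k+i}X_k^qX_{2k+i}^2 = (p_{01}+p_i)\big((a_{2+i}^2+\sigma_\varepsilon^2)\mathbb{E}[Y_\infty^q] + 2(a_{2+i}b_{2+i}+\rho_{ii})\mathbb{E}[Y_\infty^{q+1}] + (b_{2+i}^2+\sigma_\eta^2)\mathbb{E}[Y_\infty^{q+2}]\big)\mathbb{1}_{\overline{\mathcal{E}}},$$

$$\lim_{n\to\infty}\frac{\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}}{|\mathbb{T}_{n-1}^*|}\sum_{k\in\mathbb{T}_{n-1}}\delta_{2k}\delta_{2k+1}X_k^qX_{2k}X_{2k+1} = p_{01}\big((ac+\rho_\varepsilon)\mathbb{E}[Y_\infty^q] + (ad+bc+2\rho)\mathbb{E}[Y_\infty^{q+1}] + (bd+\rho_\eta)\mathbb{E}[Y_\infty^{q+2}]\big)\mathbb{1}_{\overline{\mathcal{E}}}.$$

Proof The proof follows the same lines as that of Proposition 4.11. □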
We end this section by stating how to compute the moments of the invariant law fi. 






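Lemma 4.13 Under assumptions (H.2) and (H.5), the first moments of $Y_\infty$ are

$$\mathbb{E}[Y_\infty] = \frac{\mathbb{E}[A_1]}{1-\mathbb{E}[B_1]}, \qquad \mathbb{E}[Y_\infty^2] = \frac{\mathbb{E}[A_1^2] + 2\,\mathbb{E}[A_1B_1]\,\mathbb{E}[Y_\infty]}{1-\mathbb{E}[B_1^2]},$$

and more generally, the moments of $Y_\infty$ can be calculated recursively for all $1 \le q \le 4\kappa$ thanks to
the relation $\mathbb{E}[Y_\infty^q] = \sum_{s=0}^{q} C_q^s\,\mathbb{E}[A_1^{q-s}B_1^{s}]\,\mathbb{E}[Y_\infty^{s}]$.

Proof As $Y_\infty$ is the stationary solution of equation $Y_n = A_n + B_nY_{n-1}$, $Y_\infty$ has the same law
as $A_0 + B_0Y_\infty$ where $(A_0,B_0)$ is a copy of $(A_1,B_1)$ independent from the sequence $(A_n,B_n)_{n\ge 1}$.
Hence, we can write $\mathbb{E}[Y_\infty] = \mathbb{E}[A_0 + B_0Y_\infty] = \mathbb{E}[A_1] + \mathbb{E}[B_1]\mathbb{E}[Y_\infty]$. Similarly, one has

$$\mathbb{E}[Y_\infty^2] = \mathbb{E}[(A_0+B_0Y_\infty)^2] = \mathbb{E}[A_1^2] + 2\,\mathbb{E}[A_1B_1]\,\mathbb{E}[Y_\infty] + \mathbb{E}[B_1^2]\,\mathbb{E}[Y_\infty^2].$$

The general formula is obtained in the same way by developing the relation $\mathbb{E}[Y_\infty^q] = \mathbb{E}[(A_0 + B_0Y_\infty)^q]$. □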

Note that one can easily compute the moments of $A_1$ and $B_1$ from their definition. In particular,
the two following equalities hold

$$\mathbb{E}[Y_\infty] = \frac{a\,m_0 + c\,m_1}{1 - b\,m_0 - d\,m_1},$$

and

$$\mathbb{E}[Y_\infty^2] = \frac{a^2m_0 + c^2m_1 + \sigma_\varepsilon^2 + 2\big((ab+\rho_{00})m_0 + (cd+\rho_{11})m_1\big)\mathbb{E}[Y_\infty]}{1 - (b^2m_0 + d^2m_1 + \sigma_\eta^2)},$$

with $m_0 = (p_{01}+p_0)/m$ and $m_1 = (p_{01}+p_1)/m$.
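
The recursion of Lemma 4.13 can be turned into a short computation. The sketch below is only an illustration under stated assumptions: the function name `stationary_moments` and the caller-supplied helper `joint_moment(i, j)`, returning $\mathbb{E}[A_1^iB_1^j]$, are hypothetical and not part of the original text. Since the term $s = q$ of the sum contains $\mathbb{E}[Y_\infty^q]$ itself, the code isolates it, which requires $\mathbb{E}[B_1^q] \neq 1$.

```python
from fractions import Fraction
from math import comb


def stationary_moments(joint_moment, q_max):
    """Return [E[Y^0], ..., E[Y^q_max]] for Y equal in law to A + B*Y.

    joint_moment(i, j) is a hypothetical caller-supplied function that
    returns E[A^i * B^j]; (A, B) is assumed independent of Y.
    """
    moments = [Fraction(1)]  # E[Y^0] = 1
    for q in range(1, q_max + 1):
        # Terms s = 0..q-1 of sum_s C(q, s) E[A^(q-s) B^s] E[Y^s].
        lower = sum(
            comb(q, s) * joint_moment(q - s, s) * moments[s] for s in range(q)
        )
        # The s = q term is E[B^q] E[Y^q]; move it to the left-hand side.
        denom = 1 - joint_moment(0, q)
        if denom == 0:
            raise ValueError("E[B^q] = 1: the q-th moment is not determined")
        moments.append(Fraction(lower) / denom)
    return moments


def toy_joint_moment(i, j):
    """Toy case: deterministic A = 1 and B = 1/2, so that Y = 2."""
    return Fraction(1, 2) ** j


if __name__ == "__main__":
    print(stationary_moments(toy_joint_moment, 3))
```

In the toy case the fixed point is $Y = 2$, so the moments returned are the powers of 2, which gives a quick sanity check of the recursion before plugging in the moments of $(A_1, B_1)$ of the actual model.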
4.5 Consistency of the estimators 

We are now able to prove the consistency of our estimators. We start with the computation of the
limits of the normalizing matrices $S_n$, $U_n$ and $V_n$, which is a direct consequence of Proposition 4.11.



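Proposition 4.14 Under assumptions (H.1-5), and if $\kappa \ge 2$, for $i \in \{0,1\}$, the following laws of
large numbers hold

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{S_n^i}{|\mathbb{T}_n^*|} = S^i\,\mathbb{1}_{\overline{\mathcal{E}}} = (p_{01}+p_i)\begin{pmatrix} 1 & \mathbb{E}[Y_\infty] \\ \mathbb{E}[Y_\infty] & \mathbb{E}[Y_\infty^2] \end{pmatrix}\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{S_n}{|\mathbb{T}_n^*|} = S\,\mathbb{1}_{\overline{\mathcal{E}}} = \begin{pmatrix} S^0 & 0 \\ 0 & S^1 \end{pmatrix}\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

If in addition $\kappa \ge 4$, the following convergences hold

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{U_n}{|\mathbb{T}_n^*|} = U\,\mathbb{1}_{\overline{\mathcal{E}}} = m\begin{pmatrix} 1 & 2m_0\mathbb{E}[Y_\infty] & 2m_1\mathbb{E}[Y_\infty] & \mathbb{E}[Y_\infty^2] \\ 2m_0\mathbb{E}[Y_\infty] & 4m_0\mathbb{E}[Y_\infty^2] & 0 & 2m_0\mathbb{E}[Y_\infty^3] \\ 2m_1\mathbb{E}[Y_\infty] & 0 & 4m_1\mathbb{E}[Y_\infty^2] & 2m_1\mathbb{E}[Y_\infty^3] \\ \mathbb{E}[Y_\infty^2] & 2m_0\mathbb{E}[Y_\infty^3] & 2m_1\mathbb{E}[Y_\infty^3] & \mathbb{E}[Y_\infty^4] \end{pmatrix}\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{V_n}{|\mathbb{T}_n^*|} = V\,\mathbb{1}_{\overline{\mathcal{E}}} = p_{01}\begin{pmatrix} 1 & 2\mathbb{E}[Y_\infty] & \mathbb{E}[Y_\infty^2] \\ 2\mathbb{E}[Y_\infty] & 4\mathbb{E}[Y_\infty^2] & 2\mathbb{E}[Y_\infty^3] \\ \mathbb{E}[Y_\infty^2] & 2\mathbb{E}[Y_\infty^3] & \mathbb{E}[Y_\infty^4] \end{pmatrix}\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

Besides, the matrices $S^i$, $U$ and $V$ are invertible.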
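We now turn to the consistency of our main estimators.

Proof of Theorem 3.1 As regards our main estimator $\hat\theta_n$, a direct application of Proposition 4.12
yields

$$\lim_{n\to\infty}\frac{\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}}{|\mathbb{T}_{n-1}^*|}S_{n-1}\hat\theta_n = m\begin{pmatrix} m_0(a + b\,\mathbb{E}[Y_\infty]) \\ m_0(a\,\mathbb{E}[Y_\infty] + b\,\mathbb{E}[Y_\infty^2]) \\ m_1(c + d\,\mathbb{E}[Y_\infty]) \\ m_1(c\,\mathbb{E}[Y_\infty] + d\,\mathbb{E}[Y_\infty^2]) \end{pmatrix}\mathbb{1}_{\overline{\mathcal{E}}} = S\theta\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

and the result follows from Proposition 4.14 and the definition of $\hat\theta_n$. The consistency of $\hat\sigma_n$ and $\hat\rho_n$
is a bit more complicated because their definition involves the $\hat\epsilon_k$. We give a detailed proof of the
convergence of $|\mathbb{T}_{n-1}^*|^{-1}\sum_{k\in\mathbb{T}_{n-1}}\hat\epsilon_{2k}^2$, the other terms in $U_{n-1}\hat\sigma_n$ and $V_{n-1}\hat\rho_n$ being treated similarly.
For $k \in \mathbb{G}_n$, one can develop

$$\hat\epsilon_{2k}^2 = \delta_{2k}(X_{2k} - \hat a_n - \hat b_nX_k)^2 = \delta_{2k}\big(\hat a_n^2 + 2\hat a_n\hat b_nX_k + \hat b_n^2X_k^2 - 2\hat a_nX_{2k} - 2\hat b_nX_kX_{2k} + X_{2k}^2\big).$$

Hence, the following equality holds

$$\sum_{k\in\mathbb{T}_{n-1}}\hat\epsilon_{2k}^2 = \sum_{\ell=1}^{n-1}\hat a_\ell^2\sum_{k\in\mathbb{G}_\ell}\delta_{2k} + 2\sum_{\ell=1}^{n-1}\hat a_\ell\hat b_\ell\sum_{k\in\mathbb{G}_\ell}\delta_{2k}X_k + \sum_{\ell=1}^{n-1}\hat b_\ell^2\sum_{k\in\mathbb{G}_\ell}\delta_{2k}X_k^2$$

$$\qquad - 2\sum_{\ell=1}^{n-1}\hat a_\ell\sum_{k\in\mathbb{G}_\ell}\delta_{2k}X_{2k} - 2\sum_{\ell=1}^{n-1}\hat b_\ell\sum_{k\in\mathbb{G}_\ell}\delta_{2k}X_kX_{2k} + \sum_{k\in\mathbb{T}_{n-1}}\delta_{2k}X_{2k}^2. \qquad (4.9)$$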

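The limit of the last term is given by Proposition 4.12. The first term decomposes as

$$\frac{1}{m^{n-1}}\sum_{\ell=1}^{n-1}\hat a_\ell^2\sum_{k\in\mathbb{G}_\ell}\delta_{2k} = \sum_{\ell=1}^{n-1} m^{\ell-(n-1)}\Big(\hat a_\ell^2\,\frac{1}{m^\ell}\sum_{k\in\mathbb{G}_\ell}\delta_{2k}\Big).$$

We apply Lemma A.3 of [3] to the sequence above. On the one hand,

$$\lim_{\ell\to\infty}\hat a_\ell^2\,\frac{1}{m^\ell}\sum_{k\in\mathbb{G}_\ell}\delta_{2k} = a^2(p_{01}+p_0)W \quad \text{a.s.}$$

thanks to the previous result on the consistency of $\hat\theta_n$ and Theorem 4.10. On the other hand,
the series $\sum m^{-n}$ converges to $m/(m-1)$ under assumption (H.4). Therefore, Lemma A.3 of [3]
yields

$$\lim_{n\to\infty}\frac{1}{m^{n-1}}\sum_{\ell=1}^{n-1}\hat a_\ell^2\sum_{k\in\mathbb{G}_\ell}\delta_{2k} = \frac{m}{m-1}\,a^2(p_{01}+p_0)W \quad \text{a.s.}$$

and Eq. (2.4) finally yields

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{|\mathbb{T}_{n-1}^*|}\sum_{\ell=1}^{n-1}\hat a_\ell^2\sum_{k\in\mathbb{G}_\ell}\delta_{2k} = a^2(p_{01}+p_0)\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

Note that the limit above is just the limit of $\hat a_\ell^2$ multiplied by the limit of $|\mathbb{T}_{n-1}^*|^{-1}\sum_{k\in\mathbb{T}_{n-1}}\delta_{2k}$. The
other terms in Eq. (4.9) are dealt with similarly using the results of Proposition 4.12. Finally, one
obtains the almost sure convergences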



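$$\lim_{n\to\infty}\frac{\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}}{|\mathbb{T}_{n-1}^*|}\sum_{k\in\mathbb{T}_{n-1}}\begin{pmatrix} \hat\epsilon_{2k}^2 + \hat\epsilon_{2k+1}^2 \\ 2X_k\hat\epsilon_{2k}^2 \\ 2X_k\hat\epsilon_{2k+1}^2 \\ X_k^2(\hat\epsilon_{2k}^2 + \hat\epsilon_{2k+1}^2) \end{pmatrix} = U\sigma\,\mathbb{1}_{\overline{\mathcal{E}}}, \qquad \lim_{n\to\infty}\frac{\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}}{|\mathbb{T}_{n-1}^*|}\sum_{k\in\mathbb{T}_{n-1}}\begin{pmatrix} \hat\epsilon_{2k}\hat\epsilon_{2k+1} \\ 2X_k\hat\epsilon_{2k}\hat\epsilon_{2k+1} \\ X_k^2\hat\epsilon_{2k}\hat\epsilon_{2k+1} \end{pmatrix} = V\rho\,\mathbb{1}_{\overline{\mathcal{E}}},$$

hence the result using Proposition 4.14. □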
5 Martingales and convergence rate

The aim of this section is to obtain sharper convergence results for our estimators, namely rates of
convergence. The P-BMC approach does not allow this, therefore we now use martingale theory
instead, as in [3]. However, we cannot directly apply the results therein, mainly because our
noise sequence $(\epsilon_k = \varepsilon_k + \eta_kX_{[k/2]})$ now contains the BAR process $(X_k)$ and thus does not satisfy
the assumptions of [3, 5].

5.1 Martingales on binary trees 

We start with a general convergence result for martingales on a Galton-Watson binary tree, which
we will use repeatedly in the following sections. Special cases of this result have already
been proved and used in [3] and [5]. Note that in this binary tree context, we cannot use the
standard asymptotic theory for vector martingales (see e.g. [12]) because the number of data is
roughly multiplied by $m$ at each generation.

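Theorem 5.1 Let $(M_n)$ be a $p$-dimensional $\mathcal{F}^{\mathcal{O}}$-martingale on the GW-binary tree $\mathbb{T}^*$: $M_n = \sum_{\ell=1}^{n}\sum_{k\in\mathbb{G}_\ell^*} W_k$, with $W_k = (w_k^1, \ldots, w_k^p)^t$. We make the following assumptions

(A.1) $(M_n)$ is square-integrable.

Let $\langle M\rangle_n = \sum_{\ell=0}^{n-1}\Gamma_\ell$ be the predictable quadratic variation of $(M_n)$, with

$$\Gamma_n = \mathbb{E}[\Delta M_{n+1}\Delta M_{n+1}^t \mid \mathcal{F}_n^{\mathcal{O}}].$$

(A.2) $|\mathbb{T}_{n-1}^*|^{-1}\langle M\rangle_n$ converges almost surely to a positive semi-definite matrix $\Gamma$ on $\overline{\mathcal{E}}$.

(A.3) The $p \times p$ $\mathcal{F}^{\mathcal{O}}$-matrix martingale $(K_n)$ defined by

$$K_n = \sum_{\ell=1}^{n}|\mathbb{T}_\ell^*|^{-1}\big(\Delta M_{\ell+1}\Delta M_{\ell+1}^t - \mathbb{E}[\Delta M_{\ell+1}\Delta M_{\ell+1}^t \mid \mathcal{F}_\ell^{\mathcal{O}}]\big)$$

is square-integrable and its component-wise predictable quadratic variations are $O(n)$ a.s.
on $\overline{\mathcal{E}}$.

Let $(\Xi_n)$ be a sequence of $p \times p$ invertible symmetric matrices such that

(A.4) $|\mathbb{T}_n^*|^{-1}\Xi_n$ converges a.s. to an invertible matrix $\Xi$ on $\overline{\mathcal{E}}$;

(A.5) there exists a positive constant $\alpha$ such that, on the non-extinction set $\overline{\mathcal{E}}$ and for all $n$,
the following inequality holds: $\alpha(\Xi_{n-1}^{-1} - \Xi_n^{-1}) \ge \Xi_n^{-1}\Gamma_n\Xi_n^{-1}$, in the sense that $\alpha(\Xi_{n-1}^{-1} - \Xi_n^{-1}) - \Xi_n^{-1}\Gamma_n\Xi_n^{-1}$ is a positive semi-definite matrix.

Then $M_n^t\Xi_{n-1}^{-1}M_n = O(n)$ and, if $\Xi$ is positive definite, $\|M_n\|^2 = O(nm^n)$ a.s.
If in addition, the entries of $(M_n)$ satisfy

(A.6) $\sup_n \mathbb{E}\big[\big(m^{-n/2}\sum_{k\in\mathbb{G}_n^*} w_k^q\big)^4 \mid \mathcal{F}_{n-1}^{\mathcal{O}}\big] < \infty$ almost surely,

then for all $\delta > 1/2$, $\|M_n\|^2 = o(n^\delta m^n)$ a.s. and

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{\ell=1}^{n} M_\ell^t\,\Xi_{\ell-1}^{-1}M_\ell = \mathrm{tr}(\Gamma\Xi^{-1})\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$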

Proof of the first part of Theorem 5.1 The result is obvious on the extinction set $\mathcal{E}$. In
the sequel, let us suppose that we are on the non-extinction set $\overline{\mathcal{E}}$. For all $n \ge 1$, denote $V_n = M_n^t\Xi_{n-1}^{-1}M_n$. The following equalities hold

$$V_{n+1} = M_{n+1}^t\Xi_n^{-1}M_{n+1} = (M_n + \Delta M_{n+1})^t\,\Xi_n^{-1}(M_n + \Delta M_{n+1}),$$

$$\phantom{V_{n+1}} = V_n - M_n^t(\Xi_{n-1}^{-1} - \Xi_n^{-1})M_n + 2M_n^t\Xi_n^{-1}\Delta M_{n+1} + \Delta M_{n+1}^t\Xi_n^{-1}\Delta M_{n+1},$$

since $M_n^t\Xi_n^{-1}\Delta M_{n+1}$ is scalar, and hence equal to its own transpose. By summing over the
identity above, we obtain

$$V_{n+1} + A_n = V_1 + B_{n+1} + W_{n+1}, \qquad (5.1)$$

where

$$A_n = \sum_{\ell=1}^{n} M_\ell^t(\Xi_{\ell-1}^{-1} - \Xi_\ell^{-1})M_\ell, \qquad B_{n+1} = 2\sum_{\ell=1}^{n} M_\ell^t\,\Xi_\ell^{-1}\Delta M_{\ell+1}, \qquad W_{n+1} = \sum_{\ell=1}^{n}\Delta M_{\ell+1}^t\,\Xi_\ell^{-1}\Delta M_{\ell+1}.$$

The asymptotic behavior of the sequences $(W_n)$ and $(B_n)$ is given in the following lemmas.

Lemma 5.2 Under assumptions (A.1) to (A.4), the following almost sure convergence holds

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}W_n = \frac{m-1}{m}\,\mathrm{tr}(\Gamma\Xi^{-1})\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

Lemma 5.3 Under assumptions (A.1) to (A.5), the following asymptotic property holds

$$B_{n+1} = o(n) \quad \text{a.s.}$$

One then obtains

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{V_{n+1} + A_n}{n} = \frac{m-1}{m}\,\mathrm{tr}(\Gamma\Xi^{-1})\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.} \qquad (5.2)$$

As $(A_n)$ is a sequence of positive real numbers, it follows that $V_{n+1} = O(n)$ a.s., which means
that $M_n^t\Xi_{n-1}^{-1}M_n = O(n)$ a.s. As $\Xi$ is positive definite, one obtains for large enough $n$, on the
non-extinction set $\overline{\mathcal{E}}$,

$$\|M_n\|^2 = M_n^tM_n \le \frac{M_n^t\,\Xi_{n-1}^{-1}M_n}{\lambda_{\min}(\Xi_{n-1}^{-1})},$$

where $\lambda_{\min}(\Xi_{n-1}^{-1})$ denotes the smallest eigenvalue of $\Xi_{n-1}^{-1}$. Finally, Assumption (A.4) yields
$\|M_n\|^2 = O(nm^n)$ a.s., which completes the proof of the first part of Theorem 5.1. □
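
The one-step identity behind decomposition (5.1), namely $V_{n+1} = V_n - M_n^t(\Xi_{n-1}^{-1} - \Xi_n^{-1})M_n + 2M_n^t\Xi_n^{-1}\Delta M_{n+1} + \Delta M_{n+1}^t\Xi_n^{-1}\Delta M_{n+1}$, is purely algebraic, so it can be sanity-checked numerically. The sketch below does this in the scalar case $p = 1$ only, with arbitrary random inputs; the function name `check_recursion` and the sampling ranges are illustrative choices, not taken from the paper.

```python
import random


def check_recursion(trials=1000, seed=0):
    """Return the largest absolute gap between both sides of the identity
    V_{n+1} = V_n - M_n (1/Xi_{n-1} - 1/Xi_n) M_n
              + 2 M_n dM / Xi_n + dM^2 / Xi_n
    in the scalar case p = 1, over randomly drawn inputs.
    """
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        m_n = rng.uniform(-5.0, 5.0)             # M_n
        d_m = rng.uniform(-5.0, 5.0)             # Delta M_{n+1}
        xi_prev = rng.uniform(0.5, 5.0)          # Xi_{n-1} > 0
        xi_n = xi_prev + rng.uniform(0.1, 5.0)   # Xi_n > Xi_{n-1}
        v_n = m_n * m_n / xi_prev                # V_n
        v_next = (m_n + d_m) ** 2 / xi_n         # V_{n+1}
        rhs = (
            v_n
            - m_n * (1.0 / xi_prev - 1.0 / xi_n) * m_n
            + 2.0 * m_n * d_m / xi_n
            + d_m * d_m / xi_n
        )
        worst = max(worst, abs(v_next - rhs))
    return worst


if __name__ == "__main__":
    print(check_recursion())
```

The gap stays at floating-point rounding level, as expected for an exact identity; the probabilistic content of the proof lies entirely in Lemmas 5.2 and 5.3.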



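It remains to prove Lemmas 5.2 and 5.3.

Proof of Lemma 5.2 First of all, we decompose $W_{n+1} = \mathcal{T}_{n+1} + \mathcal{R}_{n+1}$ with

$$\mathcal{T}_{n+1} = \sum_{\ell=1}^{n}\frac{\Delta M_{\ell+1}^t\,\Xi^{-1}\Delta M_{\ell+1}}{|\mathbb{T}_\ell^*|}, \qquad \mathcal{R}_{n+1} = \sum_{\ell=1}^{n}\frac{\Delta M_{\ell+1}^t\big(|\mathbb{T}_\ell^*|\,\Xi_\ell^{-1} - \Xi^{-1}\big)\Delta M_{\ell+1}}{|\mathbb{T}_\ell^*|}.$$

We first prove that $\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\mathcal{T}_n = \frac{m-1}{m}\mathrm{tr}(\Gamma\Xi^{-1})\mathbb{1}_{\overline{\mathcal{E}}}$ a.s. As $\mathcal{T}_n$ is a scalar and the trace
is commutative, we can rewrite $\mathcal{T}_{n+1} = \mathrm{tr}(\mathcal{H}_{n+1}\Xi^{-1})$ where

$$\mathcal{H}_{n+1} = \sum_{\ell=1}^{n}\frac{\Delta M_{\ell+1}\Delta M_{\ell+1}^t}{|\mathbb{T}_\ell^*|} = \sum_{\ell=1}^{n}\frac{\Gamma_\ell}{|\mathbb{T}_\ell^*|} + K_n.$$

On the one hand, by assumption (A.3), one has $K_n = o(n)$ a.s. on $\overline{\mathcal{E}}$. On the other hand,
Assumption (A.2) yields

$$\mathbb{1}_{\{|\mathbb{G}_\ell^*|>0\}}\frac{\Gamma_\ell}{|\mathbb{T}_\ell^*|} = \mathbb{1}_{\{|\mathbb{G}_\ell^*|>0\}}\Big(\frac{\langle M\rangle_{\ell+1}}{|\mathbb{T}_\ell^*|} - \frac{|\mathbb{T}_{\ell-1}^*|}{|\mathbb{T}_\ell^*|}\,\frac{\langle M\rangle_\ell}{|\mathbb{T}_{\ell-1}^*|}\Big),$$

so that $\Gamma_\ell/|\mathbb{T}_\ell^*|$ converges to $(\Gamma - \Gamma/m)\mathbb{1}_{\overline{\mathcal{E}}} = \Gamma(m-1)/m\,\mathbb{1}_{\overline{\mathcal{E}}}$ a.s. as $\ell$ tends to infinity. Hence, Cesaro
convergence yields

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\mathcal{H}_n = \frac{m-1}{m}\,\Gamma\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.} \qquad (5.3)$$

As a consequence, $\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\mathcal{T}_n/n = \mathrm{tr}(\Gamma\Xi^{-1})(m-1)/m\,\mathbb{1}_{\overline{\mathcal{E}}}$ a.s. We now turn to the asymptotic
behavior of $\mathcal{R}_{n+1}$. We know from Assumption (A.4) that $|\mathbb{T}_\ell^*|\,\Xi_\ell^{-1} - \Xi^{-1}$ goes to 0 as $\ell$ goes to
infinity on $\overline{\mathcal{E}}$. Thus, for all positive $\epsilon$, there exists $\ell_\epsilon$ such that if $\ell \ge \ell_\epsilon$,

$$\mathbb{1}_{\{|\mathbb{G}_\ell^*|>0\}}\big|\Delta M_{\ell+1}^t\big(|\mathbb{T}_\ell^*|\,\Xi_\ell^{-1} - \Xi^{-1}\big)\Delta M_{\ell+1}\big| \le 4\epsilon\,\Delta M_{\ell+1}^t\Delta M_{\ell+1}\,\mathbb{1}_{\{|\mathbb{G}_\ell^*|>0\}}.$$

Hence, there exists some positive real number $c_\epsilon$ such that, for $n \ge \ell_\epsilon$,

$$\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}|\mathcal{R}_n| \le \mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\Big(4\epsilon\sum_{\ell=\ell_\epsilon}^{n}\mathbb{1}_{\{|\mathbb{G}_\ell^*|>0\}}\frac{\Delta M_{\ell+1}^t\Delta M_{\ell+1}}{|\mathbb{T}_\ell^*|} + c_\epsilon\Big) \le \mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\big[4\epsilon\,\mathrm{tr}(\mathcal{H}_n) + c_\epsilon\big].$$

This last inequality holding for any positive $\epsilon$ and large enough $n$, the limit given by Equation (5.3)
entails that $\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\mathcal{R}_n = 0$ a.s., which completes the proof. □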

Proof of Lemma 5.3 Again, the result is obvious on the extinction set $\mathcal{E}$. Suppose now we are
on the non-extinction set $\overline{\mathcal{E}}$. Recall that

$$B_{n+1} = 2\sum_{k=1}^{n} M_k^t\,\Xi_k^{-1}\Delta M_{k+1}.$$

The process $(B_n)$ is a real-valued $\mathcal{F}^{\mathcal{O}}$-martingale. In addition, the following equality clearly holds

$$\mathbb{E}[\Delta B_{n+1}^2 \mid \mathcal{F}_n^{\mathcal{O}}] = 4M_n^t\,\Xi_n^{-1}\Gamma_n\Xi_n^{-1}M_n \quad \text{a.s.}$$

Assumption (A.5) then yields

$$\langle B\rangle_{n+1} \le 4\alpha\sum_{k=1}^{n} M_k^t(\Xi_{k-1}^{-1} - \Xi_k^{-1})M_k = 4\alpha A_n \quad \text{a.s.}$$

Hence, the law of large numbers for real martingales yields $B_n = o(A_n)$ a.s. Finally, we deduce from
decomposition (5.1) and Lemma 5.2 that

$$V_{n+1} + A_n = o(A_n) + O(n) \quad \text{a.s.}$$

leading to $A_n = O(n)$ and $V_{n+1} = O(n)$ a.s. as both sequences are non-negative. This implies in
turn that $B_n = o(n)$ a.s., completing the proof. □

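Proof of the second part of Theorem 5.1 Let us rewrite the entries of the martingale
$M_n$ as

$$P_n^q = \sum_{\ell=1}^{n} m^{\ell/2}E_\ell^q, \qquad \text{with } E_\ell^q = m^{-\ell/2}\sum_{k\in\mathbb{G}_\ell^*} w_k^q.$$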



Then one just has to apply Wei's lemma given in [26, p. 1672] to the martingale difference sequence
$E_\ell^q$ and $U_\ell = m^{\ell/2}$, and for the function $f(x) = (\log x)^\delta$ for $\delta > 1/2$. Under assumption (A.6), one
obtains $P_n^q = o(m^{n/2}n^{\delta/2})$. As $P_n^q$ is the $q$-th entry of $M_n$, one obtains $\|M_n\|^2 = o(n^\delta m^n)$ a.s.
Now recall that $V_n = M_n^t\Xi_{n-1}^{-1}M_n$; therefore, the following equality holds

$$\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}V_n = \mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}M_n^t\,\Xi_{n-1}^{-1}M_n = o(n^{2\delta}) \quad \text{a.s.},$$

for all $\delta > 1/4$. In particular, for $\delta = 1/2$, we have $V_n = o(n)$. Lemmas 5.2
and 5.3 then yield

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{A_n}{n} = \frac{m-1}{m}\,\mathrm{tr}(\Gamma\Xi^{-1})\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.} \qquad (5.4)$$
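First of all, $A_n$ may be rewritten as

$$A_n = \sum_{\ell=1}^{n} M_\ell^t(\Xi_{\ell-1}^{-1} - \Xi_\ell^{-1})M_\ell = \sum_{\ell=1}^{n} M_\ell^t\,\Xi_{\ell-1}^{-1/2}\Delta_\ell\,\Xi_{\ell-1}^{-1/2}M_\ell,$$

where $\Delta_n = I_p - \Xi_{n-1}^{1/2}\Xi_n^{-1}\Xi_{n-1}^{1/2}$. Thanks to Assumption (A.4) we know that

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\Delta_n = \frac{m-1}{m}\,I_p\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

Besides, Eq. (5.4) yields that $\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}A_n \sim n(m-1)m^{-1}\mathrm{tr}(\Gamma\Xi^{-1})\mathbb{1}_{\overline{\mathcal{E}}}$ a.s. Plugging these two
results into the equality

$$A_n = \frac{m-1}{m}\sum_{\ell=1}^{n} M_\ell^t\,\Xi_{\ell-1}^{-1}M_\ell + \sum_{\ell=1}^{n} M_\ell^t\,\Xi_{\ell-1}^{-1/2}\Big(\Delta_\ell - \frac{m-1}{m}I_p\Big)\Xi_{\ell-1}^{-1/2}M_\ell$$

gives that $\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\sum_{\ell=1}^{n} M_\ell^t\,\Xi_{\ell-1}^{-1}M_\ell \sim \mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}A_n\,m(m-1)^{-1}$ a.s. Thus one obtains

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{\ell=1}^{n} M_\ell^t\,\Xi_{\ell-1}^{-1}M_\ell = \mathrm{tr}(\Gamma\Xi^{-1})\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

which is the expected result. □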



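5.2 Rate of convergence for $\hat\theta_n$

We apply Theorem 5.1 to a suitably chosen martingale. Recall that

$$\hat\theta_n - \theta = S_{n-1}^{-1}\sum_{k\in\mathbb{T}_{n-1}}(\epsilon_{2k},\, X_k\epsilon_{2k},\, \epsilon_{2k+1},\, X_k\epsilon_{2k+1})^t = S_{n-1}^{-1}M_n, \qquad (5.5)$$

where

$$M_n = \sum_{k\in\mathbb{T}_{n-1}}(\epsilon_{2k},\, X_k\epsilon_{2k},\, \epsilon_{2k+1},\, X_k\epsilon_{2k+1})^t.$$

Under assumptions (H.1-3), for all $n \ge 0$, $k \in \mathbb{G}_n$, $\mathbb{E}[\epsilon_{2k+i} \mid \mathcal{F}_n^{\mathcal{O}}] = \mathbb{E}[X_k\epsilon_{2k+i} \mid \mathcal{F}_n^{\mathcal{O}}] = 0$ and $(M_n)$
is a square-integrable $(\mathcal{F}_n^{\mathcal{O}})$-martingale, so that Assumption (A.1) of Theorem 5.1 holds. Let us
compute the predictable quadratic variation of $(M_n)$

$$\mathbb{E}[\Delta M_{n+1}\Delta M_{n+1}^t \mid \mathcal{F}_n^{\mathcal{O}}] = \Gamma_n = \sum_{k\in\mathbb{G}_n}\gamma_k \otimes \begin{pmatrix} 1 & X_k \\ X_k & X_k^2 \end{pmatrix}, \qquad (5.6)$$

where

$$\gamma_k = \begin{pmatrix} \delta_{2k}(\sigma_\varepsilon^2 + 2X_k\rho_{00} + X_k^2\sigma_\eta^2) & \delta_{2k}\delta_{2k+1}(\rho_\varepsilon + 2X_k\rho + X_k^2\rho_\eta) \\ \delta_{2k}\delta_{2k+1}(\rho_\varepsilon + 2X_k\rho + X_k^2\rho_\eta) & \delta_{2k+1}(\sigma_\varepsilon^2 + 2X_k\rho_{11} + X_k^2\sigma_\eta^2) \end{pmatrix}. \qquad (5.7)$$

Thus the predictable quadratic variation of $(M_n)$ is given by

$$\langle M\rangle_n = \sum_{\ell=0}^{n-1}\Gamma_\ell.$$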



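Lemma 5.4 Under assumptions (H.1-5) and if $\kappa \ge 4$, the following convergence holds

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{\langle M\rangle_n}{|\mathbb{T}_{n-1}^*|} = \Gamma\,\mathbb{1}_{\overline{\mathcal{E}}} = \begin{pmatrix} \Gamma^0 & \Gamma^{01} \\ \Gamma^{01} & \Gamma^1 \end{pmatrix}\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.},$$

where $\Gamma^0$, $\Gamma^{01}$ and $\Gamma^1$ are the $2 \times 2$ matrices defined by

$$\Gamma^i = \begin{pmatrix} \sigma_\varepsilon^2\ell_i(0) + 2\rho_{ii}\ell_i(1) + \sigma_\eta^2\ell_i(2) & \sigma_\varepsilon^2\ell_i(1) + 2\rho_{ii}\ell_i(2) + \sigma_\eta^2\ell_i(3) \\ \sigma_\varepsilon^2\ell_i(1) + 2\rho_{ii}\ell_i(2) + \sigma_\eta^2\ell_i(3) & \sigma_\varepsilon^2\ell_i(2) + 2\rho_{ii}\ell_i(3) + \sigma_\eta^2\ell_i(4) \end{pmatrix},$$

$$\Gamma^{01} = \begin{pmatrix} \rho_\varepsilon\ell_{01}(0) + 2\rho\,\ell_{01}(1) + \rho_\eta\ell_{01}(2) & \rho_\varepsilon\ell_{01}(1) + 2\rho\,\ell_{01}(2) + \rho_\eta\ell_{01}(3) \\ \rho_\varepsilon\ell_{01}(1) + 2\rho\,\ell_{01}(2) + \rho_\eta\ell_{01}(3) & \rho_\varepsilon\ell_{01}(2) + 2\rho\,\ell_{01}(3) + \rho_\eta\ell_{01}(4) \end{pmatrix}.$$

In addition, $\Gamma$ is positive definite.

Proof This is a direct consequence of Proposition 4.11. □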
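
The blocks of the limit matrix in Lemma 5.4 all share one pattern: the entry in row $r$ and column $c$ (indexed from 0) is $\sigma_\varepsilon^2\ell(r+c) + 2\rho_{ii}\ell(r+c+1) + \sigma_\eta^2\ell(r+c+2)$, with $(\rho_\varepsilon, \rho, \rho_\eta)$ and $\ell_{01}$ in place of $(\sigma_\varepsilon^2, \rho_{ii}, \sigma_\eta^2)$ and $\ell_i$ for the off-diagonal block. A minimal sketch of that assembly follows; the function name `gamma_block` and the numerical values in the example are illustrative assumptions, not quantities from the paper.

```python
def gamma_block(c0, c1, c2, ell):
    """Build the 2x2 block with entries
    c0*ell[r+c] + 2*c1*ell[r+c+1] + c2*ell[r+c+2], for r, c in {0, 1}.

    For Gamma^i use (c0, c1, c2) = (sigma_eps^2, rho_ii, sigma_eta^2) and
    ell = [l_i(0), ..., l_i(4)]; for Gamma^01 use (rho_eps, rho, rho_eta)
    and ell = [l_01(0), ..., l_01(4)].
    """
    if len(ell) < 5:
        raise ValueError("ell must contain the five values l(0), ..., l(4)")
    return [
        [c0 * ell[r + c] + 2 * c1 * ell[r + c + 1] + c2 * ell[r + c + 2]
         for c in range(2)]
        for r in range(2)
    ]


if __name__ == "__main__":
    # Illustrative inputs only: l(q) taken as the moments 1, 0, 1, 0, 3.
    example = gamma_block(2.0, 0.5, 1.0, [1.0, 0.0, 1.0, 0.0, 3.0])
    print(example)
```

The block is symmetric by construction, since each entry depends on $r$ and $c$ only through $r + c$.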



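Hence, Assumption (A.2) holds if $\kappa \ge 4$. The process $(K_n)$ is clearly a square-integrable martingale
if $\kappa \ge 2$. It is not difficult to check that its component-wise predictable quadratic variation is
at worst of the order of

$$\sum_{\ell=1}^{n}\frac{1}{|\mathbb{T}_\ell^*|^2}\sum_{k\in\mathbb{G}_\ell}\delta_{2k+i}X_k^8.$$

Proposition 4.11 ensures that $|\mathbb{T}_\ell^*|^{-1}\sum_{k\in\mathbb{G}_\ell}\delta_{2k+i}X_k^4$ converges almost surely on $\overline{\mathcal{E}}$ provided $\kappa \ge 4$;
it is therefore bounded by some constant $C$. As a result, its square is also bounded by $C^2$ and
$|\mathbb{T}_\ell^*|^{-2}\sum_{k\in\mathbb{G}_\ell}\delta_{2k+i}X_k^8 \le C^2$ a.s. on $\overline{\mathcal{E}}$. Finally, one obtains that

$$\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\sum_{\ell=1}^{n}\frac{1}{|\mathbb{T}_\ell^*|^2}\sum_{k\in\mathbb{G}_\ell}\delta_{2k+i}X_k^8 \le C^2n\,\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}},$$

so that Assumption (A.3) holds if $\kappa \ge 4$.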

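We now introduce a new sequence of matrices $\Sigma_n$. They are defined as a standardized version
of the predictable quadratic variation of $(M_n)$, with the variance coefficients $\sigma_\varepsilon^2$ and $\sigma_\eta^2$ set to 1
and all the covariance coefficients $\rho_\varepsilon$, $\rho_\eta$, $\rho_{ij}$ set to 0, namely

$$\Sigma_n = \sum_{\ell=0}^{n}\Phi_\ell\Phi_\ell^t = \sum_{k\in\mathbb{T}_n}(1 + X_k^2)\begin{pmatrix} \delta_{2k} & 0 \\ 0 & \delta_{2k+1} \end{pmatrix} \otimes \begin{pmatrix} 1 & X_k \\ X_k & X_k^2 \end{pmatrix},$$

where $\Phi_n$ is the $4 \times 2^{n+1}$ matrix whose columns are, for $k \in \mathbb{G}_n = \{2^n, 2^n+1, \ldots, 2^{n+1}-1\}$, the $4 \times 1$ vectors
$(1+X_k^2)^{1/2}(\delta_{2k}, \delta_{2k}X_k, 0, 0)^t$ and $(1+X_k^2)^{1/2}(0, 0, \delta_{2k+1}, \delta_{2k+1}X_k)^t$:

$$\Phi_n = \begin{pmatrix} \delta_{2(2^n)}\sqrt{1+X_{2^n}^2} & 0 & \cdots & \delta_{2(2^{n+1}-1)}\sqrt{1+X_{2^{n+1}-1}^2} & 0 \\ \delta_{2(2^n)}X_{2^n}\sqrt{1+X_{2^n}^2} & 0 & \cdots & \delta_{2(2^{n+1}-1)}X_{2^{n+1}-1}\sqrt{1+X_{2^{n+1}-1}^2} & 0 \\ 0 & \delta_{2(2^n)+1}\sqrt{1+X_{2^n}^2} & \cdots & 0 & \delta_{2(2^{n+1}-1)+1}\sqrt{1+X_{2^{n+1}-1}^2} \\ 0 & \delta_{2(2^n)+1}X_{2^n}\sqrt{1+X_{2^n}^2} & \cdots & 0 & \delta_{2(2^{n+1}-1)+1}X_{2^{n+1}-1}\sqrt{1+X_{2^{n+1}-1}^2} \end{pmatrix}.$$

Note that $\Phi_n\Phi_n^t$, and hence $\Sigma_n$, is positive definite as soon as the $X_k$ are not constant. The next
result is again a direct consequence of Proposition 4.11.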

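Lemma 5.5 Under assumptions (H.1-5) and if $\kappa \ge 4$, the following convergence holds

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{\Sigma_n}{|\mathbb{T}_n^*|} = \Sigma\,\mathbb{1}_{\overline{\mathcal{E}}} = \begin{pmatrix} \Sigma^0 & 0 \\ 0 & \Sigma^1 \end{pmatrix}\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.},$$

where $\Sigma^i$ is the $2 \times 2$ matrix

$$\Sigma^i = \begin{pmatrix} \ell_i(0) + \ell_i(2) & \ell_i(1) + \ell_i(3) \\ \ell_i(1) + \ell_i(3) & \ell_i(2) + \ell_i(4) \end{pmatrix}.$$

In addition, $\Sigma$ is positive definite.

As a result, Assumption (A.4) also holds if $\kappa \ge 4$, and we now turn to Assumption (A.5).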

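Lemma 5.6 Under assumptions (H.1-5), for all $\alpha > \max\{2\sigma_\varepsilon^2, 2\sigma_\eta^2, \mu_0, \mu_1, \nu\}$ and for all $n$, the
following inequality holds: $\Sigma_n^{-1}\Gamma_n\Sigma_n^{-1} \le \alpha(\Sigma_{n-1}^{-1} - \Sigma_n^{-1})$, where

$$\mu_i = \frac{1}{2}\Big(\sigma_\varepsilon^2 + \sigma_\eta^2 + \big((\sigma_\varepsilon^2 - \sigma_\eta^2)^2 + 4\rho_{ii}^2\big)^{1/2}\Big), \qquad \nu = \sigma_\varepsilon^2 + \sigma_\eta^2 + \big((\sigma_\varepsilon^2 - \sigma_\eta^2)^2 + (\rho_{00} + \rho_{11})^2\big)^{1/2}.$$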
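Proof We first prove that for all such $\alpha$, $\Gamma_n \le \alpha\Phi_n\Phi_n^t$ holds. For all $k \in \mathbb{G}_n$, let
$D_k^0 = \alpha(1 + X_k^2) - (\sigma_\varepsilon^2 + 2X_k\rho_{00} + X_k^2\sigma_\eta^2)$, $D_k^1 = \alpha(1 + X_k^2) - (\sigma_\varepsilon^2 + 2X_k\rho_{11} + X_k^2\sigma_\eta^2)$ and
$D_k^{01} = \rho_\varepsilon + 2X_k\rho + X_k^2\rho_\eta$
be the coefficients of $\alpha\Phi_n\Phi_n^t - \Gamma_n$ up to the sum over $\mathbb{G}_n$. We first need to prove that $D_k^i > 0$ for
all $k$. One can rewrite $D_k^i = \alpha - \sigma_\varepsilon^2 - 2\rho_{ii}X_k + (\alpha - \sigma_\eta^2)X_k^2$, so that it is sufficient to prove that this
second order polynomial in $X_k$ has no real root, as both its terms of degree 0 and 2 are positive
by assumption. Its discriminant is, up to a positive factor, the function of $\alpha$ given by

$$\Delta(\alpha) = -\big(\alpha^2 - \alpha(\sigma_\varepsilon^2 + \sigma_\eta^2) + \sigma_\varepsilon^2\sigma_\eta^2 - \rho_{ii}^2\big).$$

This discriminant $\Delta(\alpha)$ is again a second order polynomial in $\alpha$. Therefore it is negative as soon
as $\alpha$ is larger than its largest root $\mu_i$. Second, we want to prove that $(D_k^{01})^2 \le D_k^0D_k^1$. The
Cauchy-Schwarz inequality yields

$$(D_k^{01})^2 \le (\sigma_\varepsilon^2 + 2X_k\rho_{00} + X_k^2\sigma_\eta^2)(\sigma_\varepsilon^2 + 2X_k\rho_{11} + X_k^2\sigma_\eta^2).$$

Hence, one just has to check that

$$(\sigma_\varepsilon^2 + 2X_k\rho_{00} + X_k^2\sigma_\eta^2)(\sigma_\varepsilon^2 + 2X_k\rho_{11} + X_k^2\sigma_\eta^2) \le D_k^0D_k^1,$$

which boils down to proving that the second order polynomial

$$\alpha - 2\sigma_\varepsilon^2 - 2(\rho_{00} + \rho_{11})X_k + (\alpha - 2\sigma_\eta^2)X_k^2$$

is nonnegative. Similar arguments as above yield that the preceding polynomial is nonnegative as
soon as $\alpha > \max\{2\sigma_\varepsilon^2, 2\sigma_\eta^2, \nu\}$. Thus, for $u = (u_1, u_2, u_3, u_4)^t \in \mathbb{R}^4$ the following lower bound holds

$$u^t(\alpha\Phi_n\Phi_n^t - \Gamma_n)u = \sum_{k\in\mathbb{G}_n}\bigg[\Big(u_1\delta_{2k}(D_k^0)^{1/2} + u_2\delta_{2k}X_k(D_k^0)^{1/2} - u_3\delta_{2k+1}\frac{D_k^{01}}{(D_k^0)^{1/2}} - u_4\delta_{2k+1}X_k\frac{D_k^{01}}{(D_k^0)^{1/2}}\Big)^2$$

$$\qquad + \Big(u_3\delta_{2k+1}\Big(D_k^1 - \frac{(D_k^{01})^2}{D_k^0}\Big)^{1/2} + u_4\delta_{2k+1}X_k\Big(D_k^1 - \frac{(D_k^{01})^2}{D_k^0}\Big)^{1/2}\Big)^2\bigg] \ge 0,$$

hence $\Gamma_n \le \alpha\Phi_n\Phi_n^t$. To obtain the expected result, we use Riccati equation (see e.g. Lemma B.1
of [3]) and the definition of $\Sigma_n$ to obtain

$$\Sigma_n^{-1} = \Sigma_{n-1}^{-1} - \Sigma_{n-1}^{-1}\Phi_n(I + Z_n)^{-1}\Phi_n^t\Sigma_{n-1}^{-1}, \qquad (5.8)$$

where $Z_n = \Phi_n^t\Sigma_{n-1}^{-1}\Phi_n$ and $I$ denotes the identity matrix of the same size as $Z_n$. By multiplying both sides by $\Phi_n$ we obtain

$$\Sigma_n^{-1}\Phi_n = \Sigma_{n-1}^{-1}\Phi_n - \Sigma_{n-1}^{-1}\Phi_n(I + Z_n)^{-1}(I + Z_n - I) = \Sigma_{n-1}^{-1}\Phi_n(I + Z_n)^{-1}.$$

In particular, as $Z_n$ is positive semi-definite, one obtains

$$\Sigma_n^{-1}\Phi_n(I + Z_n)^{1/2} = \Sigma_{n-1}^{-1}\Phi_n(I + Z_n)^{-1/2}.$$

Taking the square of each side of the above equation then yields

$$\Sigma_n^{-1}\Phi_n(I + Z_n)\Phi_n^t\Sigma_n^{-1} = \Sigma_{n-1}^{-1}\Phi_n(I + Z_n)^{-1}\Phi_n^t\Sigma_{n-1}^{-1} = \Sigma_{n-1}^{-1} - \Sigma_n^{-1},$$

by Equation (5.8). As $I + Z_n \ge I$ in the sense of positive semi-definite matrices, one obtains
$\Sigma_{n-1}^{-1} - \Sigma_n^{-1} \ge \Sigma_n^{-1}\Phi_n\Phi_n^t\Sigma_n^{-1}$. This inequality together with $\Gamma_n \le \alpha\Phi_n\Phi_n^t$ yield the result. □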
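
The thresholds in Lemma 5.6 are explicit, so the two positivity conditions used in the proof can be checked numerically for given parameter values. The sketch below computes $\mu_i$, $\nu$ and the lower bound on $\alpha$, then evaluates $D_k^0$, $D_k^1$ and the gap $D_k^0D_k^1 - q_0q_1$, where $q_i = \sigma_\varepsilon^2 + 2X_k\rho_{ii} + X_k^2\sigma_\eta^2$. All function names and the parameter values used in the example are illustrative assumptions, not taken from the paper.

```python
from math import sqrt


def mu(sigma_eps2, sigma_eta2, rho_ii):
    """Largest root of a^2 - a*(s_eps + s_eta) + s_eps*s_eta - rho_ii^2."""
    total = sigma_eps2 + sigma_eta2
    return 0.5 * (total + sqrt((sigma_eps2 - sigma_eta2) ** 2 + 4.0 * rho_ii ** 2))


def nu(sigma_eps2, sigma_eta2, rho_00, rho_11):
    """Threshold attached to the cross condition (D^01)^2 <= D^0 D^1."""
    total = sigma_eps2 + sigma_eta2
    return total + sqrt((sigma_eps2 - sigma_eta2) ** 2 + (rho_00 + rho_11) ** 2)


def alpha_threshold(sigma_eps2, sigma_eta2, rho_00, rho_11):
    """Lower bound on alpha appearing in the statement of Lemma 5.6."""
    return max(
        2.0 * sigma_eps2,
        2.0 * sigma_eta2,
        mu(sigma_eps2, sigma_eta2, rho_00),
        mu(sigma_eps2, sigma_eta2, rho_11),
        nu(sigma_eps2, sigma_eta2, rho_00, rho_11),
    )


def margins(alpha, x, sigma_eps2, sigma_eta2, rho_00, rho_11):
    """Return (D^0, D^1, D^0*D^1 - q_0*q_1) at X_k = x."""
    q0 = sigma_eps2 + 2.0 * x * rho_00 + x * x * sigma_eta2
    q1 = sigma_eps2 + 2.0 * x * rho_11 + x * x * sigma_eta2
    d0 = alpha * (1.0 + x * x) - q0
    d1 = alpha * (1.0 + x * x) - q1
    return d0, d1, d0 * d1 - q0 * q1


if __name__ == "__main__":
    params = (2.0, 1.0, 0.5, -0.3)  # illustrative values only
    alpha = alpha_threshold(*params) + 0.01
    print(alpha, min(margins(alpha, j / 10.0, *params)[2] for j in range(-50, 51)))
```

Expanding the gap gives $D_k^0D_k^1 - q_0q_1 = \alpha(1+X_k^2)\big[\alpha - 2\sigma_\varepsilon^2 - 2(\rho_{00}+\rho_{11})X_k + (\alpha - 2\sigma_\eta^2)X_k^2\big]$, which is why the polynomial displayed in the proof is the one whose sign matters.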

Lemma 5.7 Under assumptions (H.1-5) and if $\kappa \ge 4$, for $i \in \{0,1\}$ and $q \in \{0,1\}$, one obtains

$$\sup_{n\ge 0}\Big\{m^{-2n}\,\mathbb{E}\Big[\Big(\sum_{k\in\mathbb{G}_n}X_k^q\epsilon_{2k+i}\Big)^4 \,\Big|\, \mathcal{F}_n^{\mathcal{O}}\Big]\Big\} < \infty \quad \text{a.s.}$$

Proof The following inequality is easily proved

feGG* feeG* 

E W#(i + + *2 + * 3 + * 4 )> 

where $C$ is a constant depending only on the moments of $(\varepsilon_2, \eta_2, \varepsilon_3, \eta_3)$ up to order 4. The result
follows from Proposition 4.11. □

We have now proved that Assumptions (A.1-6) of Theorem 5.1 hold for the martingale $(M_n)$ and
the sequence of positive definite matrices $(\Xi_n) = (\Sigma_n)$, thus we obtain the following result.

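Proposition 5.8 Under assumptions (H.1-5) and if $\kappa \ge 4$, one obtains

$$M_n^t\Sigma_{n-1}^{-1}M_n = O(n) \quad \text{and} \quad \|M_n\|^2 = O(nm^n) \quad \text{a.s.}$$

In addition, for all $\delta > 1/2$, $\|M_n\|^2 = o(n^\delta m^n)$ a.s. and

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{\ell=1}^{n} M_\ell^t\,\Sigma_{\ell-1}^{-1}M_\ell = \mathrm{tr}(\Gamma\Sigma^{-1})\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

Now recall that $\hat\theta_n - \theta = S_{n-1}^{-1}M_n$. One then readily obtains Theorem 3.2.

Proof of Theorem 3.2 As $\hat\theta_n - \theta = S_{n-1}^{-1}M_n$, one obtains

$$\|\hat\theta_n - \theta\|^2 = M_n^tS_{n-1}^{-2}M_n \le \|M_n\|^2\lambda_{\max}(S_{n-1}^{-2}),$$

where $\lambda_{\max}(S_{n-1}^{-2})$ denotes the highest eigenvalue of matrix $S_{n-1}^{-2}$. We use Proposition 4.14 to
conclude that $\|\hat\theta_n - \theta\|^2 = o(n^\delta m^{-n})$ a.s. For the quadratic strong law, Proposition 5.8 yields

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{\ell=1}^{n}(\hat\theta_\ell - \theta)^tS_{\ell-1}\Sigma_{\ell-1}^{-1}S_{\ell-1}(\hat\theta_\ell - \theta) = \mathrm{tr}(\Gamma\Sigma^{-1})\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

and the result is obtained by using Proposition 4.14 and Lemma 5.5. A similar argument as in the
proof of Lemma 5.2 is used to replace $S_{\ell-1}\Sigma_{\ell-1}^{-1}S_{\ell-1}$ by its (suitably normalized) equivalent $S\Sigma^{-1}S$. □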

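5.3 Rate of convergence for $\hat\sigma_n$

We proceed in two steps. Recall that

$$\hat\sigma_n = U_{n-1}^{-1}\sum_{k\in\mathbb{T}_{n-1}}\big(\hat\epsilon_{2k}^2 + \hat\epsilon_{2k+1}^2,\; 2X_k\hat\epsilon_{2k}^2,\; 2X_k\hat\epsilon_{2k+1}^2,\; X_k^2(\hat\epsilon_{2k}^2 + \hat\epsilon_{2k+1}^2)\big)^t$$

is our estimator of $\sigma = (\sigma_\varepsilon^2, \rho_{00}, \rho_{11}, \sigma_\eta^2)^t$, and

$$\sigma_n = U_{n-1}^{-1}\sum_{k\in\mathbb{T}_{n-1}}\big(\epsilon_{2k}^2 + \epsilon_{2k+1}^2,\; 2X_k\epsilon_{2k}^2,\; 2X_k\epsilon_{2k+1}^2,\; X_k^2(\epsilon_{2k}^2 + \epsilon_{2k+1}^2)\big)^t.$$

Our first step is to prove the convergence of $\sigma_n$ to $\sigma$. The second step is the convergence of $\hat\sigma_n - \sigma_n$
with a convergence rate.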
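5.3.1 Convergence of $\sigma_n$

The convergence of $\sigma_n$ to $\sigma$ is directly obtained using the usual law of large numbers for square-integrable
vector-valued martingales. Note that one could obtain a convergence rate under stronger
moment assumptions using Theorem 5.1.

Lemma 5.9 Under assumptions (H.1-5) and if $\kappa \ge 8$, the following convergence holds

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\sigma_n = \sigma\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

Proof: Set

$$M_n^\sigma = U_{n-1}(\sigma_n - \sigma) = \sum_{\ell=1}^{n-1}\sum_{k\in\mathbb{G}_\ell}\begin{pmatrix} \epsilon_{2k}^2 + \epsilon_{2k+1}^2 - \mathbb{E}[\epsilon_{2k}^2 + \epsilon_{2k+1}^2 \mid \mathcal{F}_\ell^{\mathcal{O}}] \\ 2X_k\big(\epsilon_{2k}^2 - \mathbb{E}[\epsilon_{2k}^2 \mid \mathcal{F}_\ell^{\mathcal{O}}]\big) \\ 2X_k\big(\epsilon_{2k+1}^2 - \mathbb{E}[\epsilon_{2k+1}^2 \mid \mathcal{F}_\ell^{\mathcal{O}}]\big) \\ X_k^2\big(\epsilon_{2k}^2 + \epsilon_{2k+1}^2 - \mathbb{E}[\epsilon_{2k}^2 + \epsilon_{2k+1}^2 \mid \mathcal{F}_\ell^{\mathcal{O}}]\big) \end{pmatrix}.$$

Hence, $(M_n^\sigma)$ is a square-integrable $(\mathcal{F}_n^{\mathcal{O}})$-martingale. One can compute all the entries of its
predictable quadratic variation and prove that they all equal a constant (depending on the moments
of $(\varepsilon_2, \eta_2, \varepsilon_3, \eta_3)$ up to order 4) multiplied by $\sum\delta_{2k+i}X_k^q$ or $\sum\delta_{2k}\delta_{2k+1}X_k^q$ with $q \le 8$. Hence, the
laws of large numbers given in Proposition 4.11 ensure that $m^{-n}\langle M^\sigma\rangle_n$ converges almost surely
to a constant matrix on the non-extinction set $\overline{\mathcal{E}}$. The standard law of large numbers for square-integrable
martingales then implies that $M_n^\sigma = o(m^n)$ a.s. Besides, $m^{-n}U_n$ also converges to
a fixed matrix on the non-extinction set $\overline{\mathcal{E}}$ by Proposition 4.14. Therefore $\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}(\sigma_n - \sigma) = \mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}U_{n-1}^{-1}M_n^\sigma$ tends to 0 a.s. when $n$ tends to infinity. □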



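5.3.2 Convergence of $\hat\sigma_n - \sigma_n$

We now turn to the convergence of $\hat\sigma_n - \sigma_n$. One can rewrite $U_{n-1}(\hat\sigma_n - \sigma_n)$ as

$$U_{n-1}(\hat\sigma_n - \sigma_n) = P_n^\sigma + 2R_n^\sigma, \qquad (5.9)$$

with

$$P_n^\sigma = \sum_{k\in\mathbb{T}_{n-1}}\begin{pmatrix} (\hat\epsilon_{2k} - \epsilon_{2k})^2 + (\hat\epsilon_{2k+1} - \epsilon_{2k+1})^2 \\ 2X_k(\hat\epsilon_{2k} - \epsilon_{2k})^2 \\ 2X_k(\hat\epsilon_{2k+1} - \epsilon_{2k+1})^2 \\ X_k^2\big((\hat\epsilon_{2k} - \epsilon_{2k})^2 + (\hat\epsilon_{2k+1} - \epsilon_{2k+1})^2\big) \end{pmatrix}$$

and

$$R_n^\sigma = \sum_{k\in\mathbb{T}_{n-1}}\begin{pmatrix} \epsilon_{2k}(\hat\epsilon_{2k} - \epsilon_{2k}) + \epsilon_{2k+1}(\hat\epsilon_{2k+1} - \epsilon_{2k+1}) \\ 2X_k\epsilon_{2k}(\hat\epsilon_{2k} - \epsilon_{2k}) \\ 2X_k\epsilon_{2k+1}(\hat\epsilon_{2k+1} - \epsilon_{2k+1}) \\ X_k^2\big(\epsilon_{2k}(\hat\epsilon_{2k} - \epsilon_{2k}) + \epsilon_{2k+1}(\hat\epsilon_{2k+1} - \epsilon_{2k+1})\big) \end{pmatrix}.$$

We are going to study separately the asymptotic properties of $P_n^\sigma$ and $R_n^\sigma$.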
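Lemma 5.10 Under assumptions (H.1-5) and if $\kappa \ge 4$, one obtains

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{k\in\mathbb{T}_n}(\hat\epsilon_{2k} - \epsilon_{2k})^2 = q_0(0)\,\mathbb{1}_{\overline{\mathcal{E}}} = (m-1)\,\mathrm{tr}\big(\Gamma^0(S^0)^{-1}\big)\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

where $\Gamma^0$ is defined in Lemma 5.4 and $S^0$ in Proposition 4.14.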

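Proof: We are going to apply Theorem 5.1 to the first two entries of the martingale $(M_n)$.
Indeed, let $M_n^0$ be the 2-component vector corresponding to the first two entries of $M_n$

$$M_n^0 = \sum_{k\in\mathbb{T}_{n-1}}\begin{pmatrix} \epsilon_{2k} \\ X_k\epsilon_{2k} \end{pmatrix}.$$

Let $\theta^0 = (a, b)^t$, $\hat\theta_n^0 = (\hat a_n, \hat b_n)^t$. Clearly, one has $\hat\theta_n^0 - \theta^0 = (S_{n-1}^0)^{-1}M_n^0$, therefore one obtains

$$\sum_{k\in\mathbb{G}_n}(\hat\epsilon_{2k} - \epsilon_{2k})^2 = \sum_{k\in\mathbb{G}_n}\delta_{2k}(\hat\theta_n^0 - \theta^0)^t\begin{pmatrix} 1 & X_k \\ X_k & X_k^2 \end{pmatrix}(\hat\theta_n^0 - \theta^0).$$

Set also $\Sigma_n^0$ to be the $2 \times 2$ matrix defined by

$$\Sigma_n^0 = \sum_{k\in\mathbb{T}_n}(1 + X_k^2)\,\delta_{2k}\begin{pmatrix} 1 & X_k \\ X_k & X_k^2 \end{pmatrix}.$$

Thus, Proposition 4.14 and Lemma 5.5 yield

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}(\Sigma_n^0)^{1/2}(S_n^0)^{-1}(S_{n+1}^0 - S_n^0)(S_n^0)^{-1}(\Sigma_n^0)^{1/2} = \Delta^0\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

where $\Delta^0 = (m-1)(\Sigma^0)^{1/2}(S^0)^{-1}(\Sigma^0)^{1/2}$. Note that $\Delta^0$ is positive definite and the matrices
$\Sigma_n^0$, $S_n^0$, $\Sigma^0$, $S^0$ and $\Delta^0$ commute. We now use Theorem 5.1 for the martingale $(M_n^0)$, with the
sequence $(\Xi_n = (\Delta^0)^{-1/2}\Sigma_n^0(\Delta^0)^{-1/2})$. As $\Delta^0$ is a fixed positive definite matrix, it is clear that
all the assumptions of Theorem 5.1 hold, as in Section 5.2. Thus one obtains the a.s. limit

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{\ell=1}^{n}(M_\ell^0)^t\big((\Delta^0)^{-1/2}\Sigma_{\ell-1}^0(\Delta^0)^{-1/2}\big)^{-1}M_\ell^0 = \mathrm{tr}\big(\Gamma^0(\Sigma^0)^{-1}\Delta^0\big)\,\mathbb{1}_{\overline{\mathcal{E}}} = (m-1)\,\mathrm{tr}\big(\Gamma^0(S^0)^{-1}\big)\,\mathbb{1}_{\overline{\mathcal{E}}}.$$

Finally, a similar argument as in the proof of Lemma 5.2 is used to replace $\Delta^0$ by its asymptotic
equivalent $(\Sigma_{\ell-1}^0)^{1/2}(S_{\ell-1}^0)^{-1}(S_\ell^0 - S_{\ell-1}^0)(S_{\ell-1}^0)^{-1}(\Sigma_{\ell-1}^0)^{1/2}$ to obtain

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{k\in\mathbb{T}_n}(\hat\epsilon_{2k} - \epsilon_{2k})^2 = \lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{\ell=1}^{n}(M_\ell^0)^t(\Sigma_{\ell-1}^0)^{-1/2}\Delta^0(\Sigma_{\ell-1}^0)^{-1/2}M_\ell^0$$

$$\qquad = \lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{\ell=1}^{n}(M_\ell^0)^t\big((\Delta^0)^{-1/2}\Sigma_{\ell-1}^0(\Delta^0)^{-1/2}\big)^{-1}M_\ell^0,$$

hence the result. □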
A similar proof yields the following results for odd indices. 

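Lemma 5.11 Under assumptions (H.1-5) and if $\kappa \ge 4$, the following convergence holds

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{k\in\mathbb{T}_n}(\hat\epsilon_{2k+1} - \epsilon_{2k+1})^2 = q_1(0)\,\mathbb{1}_{\overline{\mathcal{E}}} = (m-1)\,\mathrm{tr}\big(\Gamma^1(S^1)^{-1}\big)\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

where $\Gamma^1$ is defined in Lemma 5.4 and $S^1$ in Proposition 4.14.

The proof of Lemma 5.10 can also be adapted to obtain the two following results.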

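Lemma 5.12 Under assumptions (H.1-5) and if $\kappa \ge 4$, for all $i \in \{0,1\}$, the almost sure convergence
holds

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{k\in\mathbb{T}_n}X_k(\hat\epsilon_{2k+i} - \epsilon_{2k+i})^2 = q_i(1)\,\mathbb{1}_{\overline{\mathcal{E}}} = (m-1)\,\mathrm{tr}\big(\Gamma^i(S^i)^{-2}T^i\big)\,\mathbb{1}_{\overline{\mathcal{E}}},$$

where $T^i$ is the $2 \times 2$ matrix defined by $T^i = \begin{pmatrix} \ell_i(1) & \ell_i(2) \\ \ell_i(2) & \ell_i(3) \end{pmatrix}$.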
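Proof: We prove the result for $i = 0$, the other case being similar. With the notation of the proof
of Lemma 5.10, one obtains

$$\sum_{k\in\mathbb{G}_n}X_k(\hat\epsilon_{2k} - \epsilon_{2k})^2 = \sum_{k\in\mathbb{G}_n}\delta_{2k}(\hat\theta_n^0 - \theta^0)^t\begin{pmatrix} X_k & X_k^2 \\ X_k^2 & X_k^3 \end{pmatrix}(\hat\theta_n^0 - \theta^0),$$

with $T_n^0 = \sum_{k\in\mathbb{T}_n}\delta_{2k}\begin{pmatrix} X_k & X_k^2 \\ X_k^2 & X_k^3 \end{pmatrix}$. Proposition 4.11 yields the following convergence

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}(\Sigma_n^0)^{1/2}(S_n^0)^{-1}(T_{n+1}^0 - T_n^0)(S_n^0)^{-1}(\Sigma_n^0)^{1/2} = \Delta^0\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

with a new matrix $\Delta^0$ defined by $\Delta^0 = (m-1)\Sigma^0(S^0)^{-2}T^0$. Note that this new $\Delta^0$ is again
positive definite and the matrices $\Sigma_n^0$, $S_n^0$, $T_n^0$, $\Sigma^0$, $S^0$, $T^0$ and $\Delta^0$ still commute. The end of the
proof is similar to that of Lemma 5.10 with the new matrix $\Delta^0$. □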

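Lemma 5.13 Under assumptions (H.1-5) and if $\kappa \ge 4$, for all $i \in \{0,1\}$, the almost sure convergence
holds

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}\sum_{k\in\mathbb{T}_n}X_k^2(\hat\epsilon_{2k+i} - \epsilon_{2k+i})^2 = q_i(2)\,\mathbb{1}_{\overline{\mathcal{E}}} = (m-1)\,\mathrm{tr}\big(\Gamma^i(S^i)^{-2}W^i\big)\,\mathbb{1}_{\overline{\mathcal{E}}},$$

where $W^i$ is the $2 \times 2$ matrix defined by $W^i = \begin{pmatrix} \ell_i(2) & \ell_i(3) \\ \ell_i(3) & \ell_i(4) \end{pmatrix}$.

Lemmas 5.10, 5.11, 5.12 and 5.13 give the almost sure convergence of the sequence $(P_n^\sigma)$.

Lemma 5.14 Under assumptions (H.1-5) and if $\kappa \ge 4$, the following convergence holds

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}P_n^\sigma = \big(q_0(0) + q_1(0),\; 2q_0(1),\; 2q_1(1),\; q_0(2) + q_1(2)\big)^t\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

It remains to give the limit of the sequence $(R_n^\sigma)$.

Lemma 5.15 Under assumptions (H.1-5) and if $\kappa \ge 8$, the following convergence holds

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\frac{1}{n}R_n^\sigma = 0 \quad \text{a.s.}$$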

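Proof: It is sufficient to prove that $(R_n^\sigma)$ is a martingale and that its predictable quadratic
variation is almost surely $O(n)$. For all $k \in \mathbb{G}_n$, one obtains

$$\mathbb{E}[\epsilon_{2k}(\hat\epsilon_{2k} - \epsilon_{2k}) \mid \mathcal{F}_n^{\mathcal{O}}] = \delta_{2k}\big((a - \hat a_n) + (b - \hat b_n)X_k\big)\mathbb{E}[\epsilon_{2k} \mid \mathcal{F}_n^{\mathcal{O}}] = 0,$$

and we have the same result for the other entries of $R_n^\sigma$. Hence, $(R_n^\sigma)$ is an $(\mathcal{F}_n^{\mathcal{O}})$-martingale. It is
also square-integrable. We are going to study $(R_n^\sigma)$ component-wise. We give the details for the
last entry, the others being treated similarly. For $i \in \{0,1\}$, set

$$Q_n^i = \sum_{\ell=1}^{n-1}(\theta^i - \hat\theta_\ell^i)^t\sum_{k\in\mathbb{G}_\ell}X_k^2\begin{pmatrix} 1 \\ X_k \end{pmatrix}\epsilon_{2k+i}.$$

The last entry of $R_n^\sigma$ can then be rewritten as $Q_n^0 + Q_n^1$. The processes $Q_n^i$ are clearly $(\mathcal{F}_n^{\mathcal{O}})$-martingales
with predictable quadratic variation equal to

$$\langle Q^i\rangle_n = \sum_{\ell=1}^{n-1}(\theta^i - \hat\theta_\ell^i)^t\Lambda_\ell^i(\theta^i - \hat\theta_\ell^i),$$

with $\Lambda_n^i = \sum_{k\in\mathbb{G}_n}\delta_{2k+i}(\sigma_\varepsilon^2 + 2\rho_{ii}X_k + \sigma_\eta^2X_k^2)\begin{pmatrix} X_k^4 & X_k^5 \\ X_k^5 & X_k^6 \end{pmatrix}$. Thanks to Proposition 4.11 the
sequence of matrices $(\Sigma_n^i)^{1/2}(S_{n-1}^i)^{-1}\Lambda_n^i(S_{n-1}^i)^{-1}(\Sigma_n^i)^{1/2}$ converges almost surely on the non-extinction
set $\overline{\mathcal{E}}$ to a fixed positive definite matrix $\Lambda^i$. We now use Theorem 5.1 along the same
lines as in the proof of Lemma 5.10 to obtain that $\langle Q^i\rangle_n = O(n)$, and thus $Q_n^i = o(n)$. The
other entries of $(R_n^\sigma)$ are dealt with similarly, yielding the result. □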



Proof of Theorem 3.3 It is a direct consequence of Eq. (5.9), Proposition 4.14 and Lemmas 5.9,
5.14 and 5.15. □

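5.4 Rate of convergence for $\hat\rho_n$

We proceed again in two steps. Recall that

$$\hat\rho_n = V_{n-1}^{-1}\sum_{k\in\mathbb{T}_{n-1}}\big(\hat\epsilon_{2k}\hat\epsilon_{2k+1},\; 2X_k\hat\epsilon_{2k}\hat\epsilon_{2k+1},\; X_k^2\hat\epsilon_{2k}\hat\epsilon_{2k+1}\big)^t$$

is our estimator of $\rho = (\rho_\varepsilon, \rho, \rho_\eta)^t$, and

$$\rho_n = V_{n-1}^{-1}\sum_{k\in\mathbb{T}_{n-1}}\big(\epsilon_{2k}\epsilon_{2k+1},\; 2X_k\epsilon_{2k}\epsilon_{2k+1},\; X_k^2\epsilon_{2k}\epsilon_{2k+1}\big)^t.$$

Our first step is to prove the convergence of $\rho_n$ to $\rho$. The second step is the convergence of $\hat\rho_n - \rho_n$
with a convergence rate.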

5.4.1 Convergence of $\rho_n$

The convergence of $\rho_n$ to $\rho$ is again directly obtained using the standard law of large numbers for
square-integrable vector-valued martingales. Note that one could also obtain a convergence rate
under stronger moment assumptions using Theorem 5.1.

Lemma 5.16 Under assumptions (H.1-5) and if $\kappa \ge 8$, the following convergence holds

$$\lim_{n\to\infty}\mathbb{1}_{\{|\mathbb{G}_n^*|>0\}}\rho_n = \rho\,\mathbb{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$

Proof: Set

$$M_n^\rho = V_{n-1}(\rho_n - \rho) = \sum_{\ell=1}^{n-1}\sum_{k\in\mathbb{G}_\ell}\begin{pmatrix} \epsilon_{2k}\epsilon_{2k+1} - \mathbb{E}[\epsilon_{2k}\epsilon_{2k+1} \mid \mathcal{F}_\ell^{\mathcal{O}}] \\ 2X_k\big(\epsilon_{2k}\epsilon_{2k+1} - \mathbb{E}[\epsilon_{2k}\epsilon_{2k+1} \mid \mathcal{F}_\ell^{\mathcal{O}}]\big) \\ X_k^2\big(\epsilon_{2k}\epsilon_{2k+1} - \mathbb{E}[\epsilon_{2k}\epsilon_{2k+1} \mid \mathcal{F}_\ell^{\mathcal{O}}]\big) \end{pmatrix}.$$

Hence, $(M_n^\rho)$ is a square-integrable $(\mathcal{F}_n^{\mathcal{O}})$-martingale. One can compute all the entries of its
predictable quadratic variation and prove that they all equal a constant (depending on the moments
of $(\varepsilon_2, \eta_2, \varepsilon_3, \eta_3)$ up to order 4) multiplied by $\sum\delta_{2k}\delta_{2k+1}X_k^q$ with $q \le 8$. Therefore, the laws of
large numbers given in Proposition 4.11 ensure that $m^{-n}\langle M^\rho\rangle_n$ converges almost surely to
a constant matrix on the non-extinction set $\overline{\mathcal{E}}$. The standard law of large numbers for square-integrable
martingales then implies that $M_n^\rho = o(m^n)$ a.s. Besides, $m^{-n}V_n$ also converges to a
fixed matrix on the non-extinction set $\overline{\mathcal{E}}$ by Proposition 4.14. Thus $\rho_n - \rho = V_{n-1}^{-1}M_n^\rho$ tends to
0 a.s. on $\overline{\mathcal{E}}$ when $n$ tends to infinity. □

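5.4.2 Convergence of $\hat{\boldsymbol{\rho}}_n - \boldsymbol{\rho}_n$

We now turn to the convergence of $\hat{\boldsymbol{\rho}}_n - \boldsymbol{\rho}_n$. We follow the same steps as in Section 5.3.2. One
can rewrite $V_{n-1}(\hat{\boldsymbol{\rho}}_n - \boldsymbol{\rho}_n)$ as

$$V_{n-1}(\hat{\boldsymbol{\rho}}_n - \boldsymbol{\rho}_n) = P_n + R^{\rho}_n, \qquad (5.10)$$

with

$$P_n = \sum_{k \in \mathbb{T}_{n-1}} (\hat{\epsilon}_{2k} - \epsilon_{2k})(\hat{\epsilon}_{2k+1} - \epsilon_{2k+1}) \big( 1, 2X_k, X_k^2 \big)^t,$$

$$R^{\rho}_n = \sum_{k \in \mathbb{T}_{n-1}} \big( \epsilon_{2k+1}(\hat{\epsilon}_{2k} - \epsilon_{2k}) + \epsilon_{2k}(\hat{\epsilon}_{2k+1} - \epsilon_{2k+1}) \big) \big( 1, 2X_k, X_k^2 \big)^t.$$

We are going to study separately the asymptotic properties of $P_n$ and $R^{\rho}_n$. The limit of $R^{\rho}_n$ is
obtained as in Lemma 5.15.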
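For the reader's convenience, the splitting behind Eq. (5.10) is the elementary identity below, written with the shorthand $a = \epsilon_{2k}$, $b = \epsilon_{2k+1}$ and hats for the corresponding residuals (this shorthand is introduced here only for the illustration).

```latex
\hat{a}\,\hat{b} - a\,b
  = (\hat{a} - a)(\hat{b} - b) + b\,(\hat{a} - a) + a\,(\hat{b} - b).
```

Multiplying by $(1, 2X_k, X_k^2)^t$ and summing over $k \in \mathbb{T}_{n-1}$, the first term produces $P_n$ and the last two produce $R^{\rho}_n$.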

Lemma 5.17 Under assumptions (H.1-5) and if $\kappa > 8$, one obtains

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}^*_n| > 0\}}\, \frac{1}{n} R^{\rho}_n = 0 \quad \text{a.s.}$$

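Lemma 5.18 Under assumptions (H.1-5) and if $\kappa > 4$, the following almost sure convergence
holds

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}^*_n| > 0\}}\, \frac{1}{n} \sum_{k \in \mathbb{T}_n} (\hat{\epsilon}_{2k} - \epsilon_{2k})(\hat{\epsilon}_{2k+1} - \epsilon_{2k+1}) = q_{01}(0)\, \mathbf{1}_{\overline{\mathcal{E}}} = [\ldots]\ \mathrm{tr}\big( \Gamma S^{-2} J^{01} \big)\, \mathbf{1}_{\overline{\mathcal{E}}},$$

where

$$J^{01} = \begin{pmatrix} 0 & S^{01} \\ S^{01} & 0 \end{pmatrix} \quad \text{and} \quad S^{01} = \begin{pmatrix} \ell_{01}(0) & \ell_{01}(1) \\ \ell_{01}(1) & \ell_{01}(2) \end{pmatrix}.$$
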
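Proof : First, notice that for all $k \in \mathbb{G}_n$, the following decomposition holds

$$2(\hat{\epsilon}_{2k} - \epsilon_{2k})(\hat{\epsilon}_{2k+1} - \epsilon_{2k+1}) = \delta_{2k}\delta_{2k+1}\, (\hat{\boldsymbol{\theta}}_n - \boldsymbol{\theta})^t \begin{pmatrix} 0 & 0 & 1 & X_k \\ 0 & 0 & X_k & X_k^2 \\ 1 & X_k & 0 & 0 \\ X_k & X_k^2 & 0 & 0 \end{pmatrix} (\hat{\boldsymbol{\theta}}_n - \boldsymbol{\theta}).$$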
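The decomposition above is a product of two linear forms rewritten as a quadratic form, and it can be checked numerically. The sketch below assumes the usual residual definition, so that the error on daughter $2k$ is $-((\hat{a} - a) + (\hat{b} - b)X_k)$ and the error on daughter $2k+1$ is $-((\hat{c} - c) + (\hat{d} - d)X_k)$, with both daughters observed. The vector `d` stands for $\hat{\boldsymbol{\theta}}_n - \boldsymbol{\theta}$, and the function names are hypothetical.

```python
def quad_form(d, x):
    """d^t B(x) d with B(x) = [[0, A], [A, 0]] and A = [[1, x], [x, x^2]]."""
    a = [[1.0, x], [x, x * x]]
    b = [
        [0.0, 0.0, a[0][0], a[0][1]],
        [0.0, 0.0, a[1][0], a[1][1]],
        [a[0][0], a[0][1], 0.0, 0.0],
        [a[1][0], a[1][1], 0.0, 0.0],
    ]
    return sum(d[i] * b[i][j] * d[j] for i in range(4) for j in range(4))


def residual_product(d, x):
    """Twice the product of the two residual errors, both daughters observed."""
    err_even = -(d[0] + d[1] * x)  # assumed form of hat(eps)_2k - eps_2k
    err_odd = -(d[2] + d[3] * x)   # assumed form of hat(eps)_2k+1 - eps_2k+1
    return 2.0 * err_even * err_odd
```

Both functions return the same value for any `d` and `x`, which is the content of the decomposition when $\delta_{2k}\delta_{2k+1} = 1$.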



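Hence, one obtains

$$2 \sum_{k \in \mathbb{T}_n} (\hat{\epsilon}_{2k} - \epsilon_{2k})(\hat{\epsilon}_{2k+1} - \epsilon_{2k+1}) = \sum_{\ell=1}^{n} M_\ell^t S_{\ell-1}^{-1} \big( J^{01}_\ell - J^{01}_{\ell-1} \big) S_{\ell-1}^{-1} M_\ell,$$

with

$$J^{01}_n = \begin{pmatrix} 0 & S^{01}_n \\ S^{01}_n & 0 \end{pmatrix} \quad \text{and} \quad S^{01}_n = \sum_{k \in \mathbb{T}_n} \delta_{2k}\delta_{2k+1} \begin{pmatrix} 1 & X_k \\ X_k & X_k^2 \end{pmatrix}.$$

Set $\Delta_n = \Sigma_{n-1}^{1/2} S_{n-1}^{-1} (J^{01}_n - J^{01}_{n-1}) S_{n-1}^{-1} \Sigma_{n-1}^{1/2}$. Proposition […] yields

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}^*_n| > 0\}}\, \Delta_n = \Delta\, \mathbf{1}_{\overline{\mathcal{E}}} = (m-1)\, \Sigma^{1/2} S^{-1} J^{01} S^{-1} \Sigma^{1/2}\, \mathbf{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$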

Hence, as in the proof of Lemma 5.10, it is sufficient to study the convergence of $\sum_{\ell=1}^{n} M_\ell^t \Sigma_{\ell-1}^{-1/2} \Delta\, \Sigma_{\ell-1}^{-1/2} M_\ell$.
To this end, we apply Theorem 5.1 to the martingale $(M_n)$ with the sequence of matrices
$\Xi_n = \Sigma_n^{1/2} \Delta^{-1} \Sigma_n^{1/2}$. As we have seen before, assumptions (A.1-3) and (A.6) hold. Assumption
(A.4) is a direct consequence of Lemma 5.5. We will not investigate Assumption (A.5) but
directly prove Lemma 5.3, a part of Theorem 5.1, in our specific context, that is

$$\mathcal{B}'_{n+1} = 2 \sum_{\ell=1}^{n} M_\ell^t\, \Xi_\ell^{-1}\, \Delta M_{\ell+1} = o(n).$$

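Note that the matrix $J^{01}$ has a special property, namely

$$(J^{01})^2 = \begin{pmatrix} (S^{01})^2 & 0 \\ 0 & (S^{01})^2 \end{pmatrix} = \begin{pmatrix} S^{01} & 0 \\ 0 & S^{01} \end{pmatrix}^2 = (\tilde{J}^{01})^2,$$

so that although $J^{01}$ is not positive definite, $\tilde{J}^{01}$ is. As a result, as the matrices $J_n$, $S_n$ and $\Sigma_n$
commute, one has $\Delta^2 = (\Delta')^2$ with the positive definite matrix $\Delta' = (m-1)\, \Sigma^{1/2} S^{-1} \tilde{J}^{01} S^{-1} \Sigma^{1/2}$.
We have seen that the coefficient $\alpha$ given by Lemma […] satisfies $\Sigma_n^{-1} \Gamma_n \Sigma_n^{-1} \le \alpha (\Sigma_{n-1}^{-1} - \Sigma_n^{-1})$.
In view of the property of $J^{01}$ and the fact that the matrices $\Delta$, $\Delta'$, $\Sigma_n$ and $\Gamma_n$ commute, we
obtain

$$\big( \Sigma_n^{1/2} \Delta^{-1} \Sigma_n^{1/2} \big)^{-1} \Gamma_n \big( \Sigma_n^{1/2} \Delta^{-1} \Sigma_n^{1/2} \big)^{-1} = \Delta' \Sigma_n^{-1} \Gamma_n \Sigma_n^{-1} \Delta' \le \alpha\, \Delta' (\Sigma_{n-1}^{-1} - \Sigma_n^{-1}) \Delta'.$$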
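The special property of $J^{01}$ can be illustrated on a small numerical example. The sketch below assumes that the positive definite companion of $J^{01}$ is the block-diagonal matrix carrying two copies of $S^{01}$, and it uses a hypothetical positive definite block in place of the true $S^{01}$.

```python
def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def quad(mat, x):
    """Quadratic form x^t mat x."""
    return sum(x[i] * mat[i][j] * x[j] for i in range(4) for j in range(4))


ZERO = [[0.0, 0.0], [0.0, 0.0]]
S01 = [[2.0, 1.0], [1.0, 3.0]]         # hypothetical positive definite 2x2 block
J = block(ZERO, S01, S01, ZERO)        # anti-diagonal arrangement, indefinite
J_TILDE = block(S01, ZERO, ZERO, S01)  # diagonal arrangement, positive definite
```

The two matrices have the same square, while the quadratic form of `J` is negative on vectors of the form $(u, -u)$, which is why the argument switches to the positive definite companion.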
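Thanks to Lemma 5.5, we have the convergence

$$\lim_{n \to \infty} \frac{\lambda_{\max}(\Sigma_{n-1}^{-1} - \Sigma_n^{-1})}{\lambda_{\min}(\Sigma_{n-1}^{-1} - \Sigma_n^{-1})} = \frac{\lambda_{\max}(\Sigma^{-1})}{\lambda_{\min}(\Sigma^{-1})} > 0.$$

So there exist $n_0 > 0$ and $\beta > 0$ such that

$$\beta\, \lambda_{\min}(\Sigma_{n-1}^{-1} - \Sigma_n^{-1}) \ge \lambda_{\max}(\Sigma_{n-1}^{-1} - \Sigma_n^{-1})\, \lambda_{\max}\big( (\Delta')^2 \big) \quad \text{for } n \ge n_0,$$

which implies $\Delta' (\Sigma_{n-1}^{-1} - \Sigma_n^{-1}) \Delta' \le \beta (\Sigma_{n-1}^{-1} - \Sigma_n^{-1})$, and

$$E\big[ (\Delta \mathcal{B}'_{n+1})^2 \mid \mathcal{F}^{\mathcal{O}}_n \big] = 4 M_n^t\, \Xi_n^{-1} \Gamma_n \Xi_n^{-1} M_n \le \alpha\beta\, M_n^t (\Sigma_{n-1}^{-1} - \Sigma_n^{-1}) M_n \quad \text{for } n \ge n_0.$$

$M_n^t (\Sigma_{n-1}^{-1} - \Sigma_n^{-1}) M_n$ is the increment of $\mathcal{A}_n$ defined in Eq. (5.1) when $\Xi_n = \Sigma_n$. By the
arguments used in the proof of Proposition 5.8, $\sum_{\ell=1}^{n} M_\ell^t (\Sigma_{\ell-1}^{-1} - \Sigma_\ell^{-1}) M_\ell = \mathcal{O}(n)$, which implies
$\mathcal{B}'_{n+1} = o(n)$. We can then apply Theorem 5.1, which immediately yields the result. □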
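The eigenvalue step used in this proof is a general fact about symmetric matrices: if $A$ is positive definite and $\beta\, \lambda_{\min}(A) \ge \lambda_{\max}(A)\, \lambda_{\max}(D^2)$, then $D A D \le \beta A$. A minimal numerical sketch in dimension 2 follows, where closed-form eigenvalues are available. Here $A$ plays the role of $\Sigma_{n-1}^{-1} - \Sigma_n^{-1}$ and $D$ the role of $\Delta'$; the function names and any matrices passed to them are hypothetical.

```python
import math


def eig_sym2(m):
    """Eigenvalues (min, max) of a symmetric 2x2 matrix."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    return tr / 2.0 - disc, tr / 2.0 + disc


def mul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def smallest_beta(a, d):
    """Smallest beta with beta * lambda_min(a) >= lambda_max(a) * lambda_max(d^2)."""
    lam_min, lam_max = eig_sym2(a)
    return lam_max * eig_sym2(mul2(d, d))[1] / lam_min


def gap_min_eig(a, d, beta):
    """lambda_min(beta * a - d a d); a nonnegative value means d a d <= beta * a."""
    dad = mul2(mul2(d, a), d)
    diff = [[beta * a[i][j] - dad[i][j] for j in range(2)] for i in range(2)]
    return eig_sym2(diff)[0]
```

The chain behind the check is $D A D \le \lambda_{\max}(A) D^2 \le \lambda_{\max}(A) \lambda_{\max}(D^2) I \le \beta \lambda_{\min}(A) I \le \beta A$.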

Similarly, one obtains the two following convergences. 

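Lemma 5.19 Under assumptions (H.1-5) and if $\kappa > 4$, the following almost sure convergence
holds

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}^*_n| > 0\}}\, \frac{1}{n} \sum_{k \in \mathbb{T}_n} X_k (\hat{\epsilon}_{2k} - \epsilon_{2k})(\hat{\epsilon}_{2k+1} - \epsilon_{2k+1}) = q_{01}(1)\, \mathbf{1}_{\overline{\mathcal{E}}} = [\ldots]\ \mathrm{tr}\big( \Gamma S^{-2} K^{01} \big)\, \mathbf{1}_{\overline{\mathcal{E}}},$$

where

$$K^{01} = \begin{pmatrix} 0 & T^{01} \\ T^{01} & 0 \end{pmatrix} \quad \text{and} \quad T^{01} = \begin{pmatrix} \ell_{01}(1) & \ell_{01}(2) \\ \ell_{01}(2) & \ell_{01}(3) \end{pmatrix}.$$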



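Lemma 5.20 Under assumptions (H.1-5) and if $\kappa > 4$, the following almost sure convergence
holds

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}^*_n| > 0\}}\, \frac{1}{n} \sum_{k \in \mathbb{T}_n} X_k^2 (\hat{\epsilon}_{2k} - \epsilon_{2k})(\hat{\epsilon}_{2k+1} - \epsilon_{2k+1}) = q_{01}(2)\, \mathbf{1}_{\overline{\mathcal{E}}} = [\ldots]\ \mathrm{tr}\big( \Gamma S^{-2} L^{01} \big)\, \mathbf{1}_{\overline{\mathcal{E}}},$$

where

$$L^{01} = \begin{pmatrix} 0 & U^{01} \\ U^{01} & 0 \end{pmatrix} \quad \text{and} \quad U^{01} = \begin{pmatrix} \ell_{01}(2) & \ell_{01}(3) \\ \ell_{01}(3) & \ell_{01}(4) \end{pmatrix}.$$
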
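Thus, one obtains the following limit.

Lemma 5.21 Under assumptions (H.1-5) and if $\kappa > 4$, the following almost sure convergence
holds

$$\lim_{n \to \infty} \mathbf{1}_{\{|\mathbb{G}^*_n| > 0\}}\, \frac{1}{n} P_n = \big( q_{01}(0),\ 2q_{01}(1),\ q_{01}(2) \big)^t\, \mathbf{1}_{\overline{\mathcal{E}}} \quad \text{a.s.}$$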



Proof of Theorem 3.4: It is a direct consequence of Eq. (5.10), Proposition 4.14 and Lemmas 5.16 […]



6 Asymptotic normality 

To derive the central limit theorems (CLT), we use a CLT for martingales given in Theorem
3.II.10 of [12]. To this aim, we use a new filtration. Namely, instead of using the observed
generation-wise filtration, we will use the sister pair-wise one. Let

$$\mathcal{G}^{\mathcal{O}}_p = \mathcal{O} \vee \sigma\{ \delta_1 X_1,\ (\delta_{2k} X_{2k}, \delta_{2k+1} X_{2k+1}),\ 1 \le k \le p \}$$

be the $\sigma$-algebra generated by the whole history $\mathcal{O}$ of the Galton-Watson process and all observed
individuals up to the offspring of individual $p$. Hence $(\delta_{2k}\epsilon_{2k}, \delta_{2k+1}\epsilon_{2k+1})$ is $\mathcal{G}^{\mathcal{O}}_k$-measurable. In all
the sequel, we will work on the non-extinction probability space $(\overline{\mathcal{E}}, P_{\overline{\mathcal{E}}})$ and we denote by $E_{\overline{\mathcal{E}}}$ the
corresponding expectation.



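Proof of Theorem 3.5, first step: For a fixed integer $n \ge 1$, let us define the $(\mathcal{G}^{\mathcal{O}}_p)$-martingale
$(M^{(n)}_p)_{\{p \ge 1\}}$ by

$$M^{(n)}_p = \frac{1}{|\mathbb{T}^*_n|^{1/2}} \sum_{k=1}^{p} D_k \quad \text{with} \quad D_k = \big( \epsilon_{2k},\ X_k \epsilon_{2k},\ \epsilon_{2k+1},\ X_k \epsilon_{2k+1} \big)^t.$$

Indeed, under (H.1-5), $(D_k)$ is clearly a $(\mathcal{G}^{\mathcal{O}}_k)$-martingale difference sequence. Set $\nu_n = |\mathbb{T}_n| = 2^{n+1} - 1$.
Therefore the following equality holds

$$M^{(n)}_{\nu_n} = \frac{1}{|\mathbb{T}^*_n|^{1/2}} \sum_{k=1}^{|\mathbb{T}_n|} D_k = \frac{1}{|\mathbb{T}^*_n|^{1/2}} M_{n+1}.$$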
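The index bookkeeping used here (cell $k$ has daughters $2k$ and $2k+1$, and $|\mathbb{T}_n| = 2^{n+1} - 1$) can be made concrete with a small sketch. It assumes the root cell is labelled 1 and sits in generation 0; the function names are hypothetical.

```python
def full_tree_size(n):
    """Number of cells in a complete binary tree from generation 0 to generation n."""
    return 2 ** (n + 1) - 1


def generation(k):
    """Generation r_k of cell k when the root is labelled 1 and cell k has daughters 2k and 2k+1."""
    return k.bit_length() - 1


def daughters(k):
    """Labels of the two daughters of cell k."""
    return 2 * k, 2 * k + 1
```

For instance, a complete tree up to generation 9 has `full_tree_size(9)` cells, the value 1023 quoted in Section 7, of which 663 are observed in the data set studied there.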

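We want to apply Theorem 3.II.10 of [12] to the process $(M^{(n)}_p)$. As the non-extinction set $\overline{\mathcal{E}}$ is in
$\mathcal{G}^{\mathcal{O}}_k$ for every $k \ge 1$, it is easy to prove that

$$E_{\overline{\mathcal{E}}}\big[ D_k D_k^t \mid \mathcal{G}^{\mathcal{O}}_{k-1} \big] = E\big[ D_k D_k^t \mid \mathcal{G}^{\mathcal{O}}_{k-1} \big] = \gamma_k \otimes \begin{pmatrix} 1 & X_k \\ X_k & X_k^2 \end{pmatrix},$$

where $\gamma_k$ is defined in Eq. (5.7). Lemma […] gives the $P_{\overline{\mathcal{E}}}$ almost sure limit of the following process

$$\frac{1}{|\mathbb{T}^*_n|} \sum_{k \in \mathbb{T}_n} E_{\overline{\mathcal{E}}}\big[ D_k D_k^t \mid \mathcal{G}^{\mathcal{O}}_{k-1} \big] \xrightarrow[n \to \infty]{} \Gamma \quad \text{a.s.}$$

Therefore, the first assumption of Theorem 3.II.10 of [12] holds under $P_{\overline{\mathcal{E}}}$. Thanks to $\gamma > \kappa > 4$,
we can easily prove that for some $r > 2$, one obtains

$$\sup_{k \ge 0} E\big[ \| D_k \|^r \mid \mathcal{G}^{\mathcal{O}}_{k-1} \big] < \infty \quad \text{a.s.}$$

which in turn implies the Lindeberg condition. We can now conclude that under $P_{\overline{\mathcal{E}}}$ the following
convergence holds

$$\frac{1}{|\mathbb{T}^*_n|^{1/2}} \sum_{k \in \mathbb{T}_n} D_k = \frac{1}{|\mathbb{T}^*_n|^{1/2}} M_{n+1} \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, \Gamma).$$

Finally, result (3.3) follows from Eq. (5.5) and Proposition 4.14 together with Slutsky's Lemma. □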

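Proof of Theorem 3.5, second step: We apply Theorem 3.II.10 of [12] again to the sequences
$(M^{(\sigma,n)}_p)_{\{p \ge 1\}}$ of $(\mathcal{G}^{\mathcal{O}}_p)$-martingales defined by

$$|\mathbb{T}^*_{n-1}|^{1/2} M^{(\sigma,n)}_p = \sum_{k=1}^{p} D^{\sigma}_k = \sum_{k=1}^{p} \begin{pmatrix} \epsilon_{2k}^2 + \epsilon_{2k+1}^2 - E[\epsilon_{2k}^2 + \epsilon_{2k+1}^2 \mid \mathcal{F}^{\mathcal{O}}_{r_k}] \\ 2X_k \big( \epsilon_{2k}^2 - E[\epsilon_{2k}^2 \mid \mathcal{F}^{\mathcal{O}}_{r_k}] \big) \\ 2X_k \big( \epsilon_{2k+1}^2 - E[\epsilon_{2k+1}^2 \mid \mathcal{F}^{\mathcal{O}}_{r_k}] \big) \\ X_k^2 \big( \epsilon_{2k}^2 + \epsilon_{2k+1}^2 - E[\epsilon_{2k}^2 + \epsilon_{2k+1}^2 \mid \mathcal{F}^{\mathcal{O}}_{r_k}] \big) \end{pmatrix},$$

where $r_k$ denotes the generation of $k$. Set $\nu_n = |\mathbb{T}_{n-1}| = 2^n - 1$. One obtains $|\mathbb{T}^*_{n-1}|^{1/2} M^{(\sigma,n)}_{\nu_n} = U_{n-1}(\boldsymbol{\sigma}_n - \boldsymbol{\sigma})$.
We have to study the limit $\Gamma^{\sigma}$ of the process $|\mathbb{T}^*_{n-1}|^{-1} \sum_{k=1}^{\nu_n} E_{\overline{\mathcal{E}}}[D^{\sigma}_k (D^{\sigma}_k)^t \mid \mathcal{G}^{\mathcal{O}}_{k-1}]$.

In order to compute the conditional expectation, let us denote, for $k \ge 1$ and $i \in \{0, 1\}$,

$$A_i(k) = \delta_{2k+i} \Big( \sum_{r=0}^{4} C_4^r\, \theta\big( (1-i)(4-r), (1-i)r, i(4-r), ir \big) X_k^r - \big( \sigma_{\epsilon}^4 + 4\rho_{ii}\sigma_{\epsilon}^2 X_k + (4\rho_{ii}^2 + 2\sigma_{\epsilon}^2\sigma_{\eta}^2) X_k^2 + 4\rho_{ii}\sigma_{\eta}^2 X_k^3 + \sigma_{\eta}^4 X_k^4 \big) \Big),$$

$$A_{01}(k) = \delta_{2k}\delta_{2k+1} \Big( \sum_{r=0}^{2} \sum_{s=0}^{2} C_2^r C_2^s\, \theta(2-r, r, 2-s, s) X_k^{r+s} - \big( \sigma_{\epsilon}^4 + 2\sigma_{\epsilon}^2(\rho_{00} + \rho_{11}) X_k + (2\sigma_{\epsilon}^2\sigma_{\eta}^2 + 4\rho_{00}\rho_{11}) X_k^2 + 2\sigma_{\eta}^2(\rho_{00} + \rho_{11}) X_k^3 + \sigma_{\eta}^4 X_k^4 \big) \Big),$$

and $B_i(k) = A_i(k) + A_{01}(k)$. Using these notations, we obtain

$$E_{\overline{\mathcal{E}}}\big[ D^{\sigma}_k (D^{\sigma}_k)^t \mid \mathcal{G}^{\mathcal{O}}_{k-1} \big] = \begin{pmatrix} (B_0 + B_1)(k) & 2X_k B_0(k) & 2X_k B_1(k) & X_k^2 (B_0 + B_1)(k) \\ 2X_k B_0(k) & 4X_k^2 A_0(k) & 4X_k^2 A_{01}(k) & 2X_k^3 B_0(k) \\ 2X_k B_1(k) & 4X_k^2 A_{01}(k) & 4X_k^2 A_1(k) & 2X_k^3 B_1(k) \\ X_k^2 (B_0 + B_1)(k) & 2X_k^3 B_0(k) & 2X_k^3 B_1(k) & X_k^4 (B_0 + B_1)(k) \end{pmatrix}.$$

The almost sure limit of the above quantity is given by Proposition 4.11. Indeed, the following
convergences hold

$$\lim_{n \to \infty} \frac{1}{|\mathbb{T}^*_{n-1}|} \sum_{k=1}^{\nu_n} A_i(k) X_k^q = A_i^q \quad \text{and} \quad \lim_{n \to \infty} \frac{1}{|\mathbb{T}^*_{n-1}|} \sum_{k=1}^{\nu_n} A_{01}(k) X_k^q = A_{01}^q \quad \text{a.s.}$$

with

$$A_i^q = \sum_{r=0}^{4} C_4^r\, \theta\big( (1-i)(4-r), (1-i)r, i(4-r), ir \big) \ell_i(r+q) - \big( \sigma_{\epsilon}^4 \ell_i(q) + 4\rho_{ii}\sigma_{\epsilon}^2 \ell_i(1+q) + (4\rho_{ii}^2 + 2\sigma_{\epsilon}^2\sigma_{\eta}^2) \ell_i(2+q) + 4\rho_{ii}\sigma_{\eta}^2 \ell_i(3+q) + \sigma_{\eta}^4 \ell_i(4+q) \big),$$

$$A_{01}^q = \sum_{r=0}^{2} \sum_{s=0}^{2} C_2^r C_2^s\, \theta(2-r, r, 2-s, s) \ell_{01}(r+s+q) - \big( \sigma_{\epsilon}^4 \ell_{01}(q) + 2\sigma_{\epsilon}^2(\rho_{00} + \rho_{11}) \ell_{01}(1+q) + (2\sigma_{\epsilon}^2\sigma_{\eta}^2 + 4\rho_{00}\rho_{11}) \ell_{01}(2+q) + 2\sigma_{\eta}^2(\rho_{00} + \rho_{11}) \ell_{01}(3+q) + \sigma_{\eta}^4 \ell_{01}(4+q) \big).$$

We also set $B_i^q = A_i^q + A_{01}^q$. With these notations, we can write explicitly the limit matrix $\Gamma^{\sigma}$ of
the process $|\mathbb{T}^*_{n-1}|^{-1} \sum_{k=1}^{\nu_n} E_{\overline{\mathcal{E}}}[D^{\sigma}_k (D^{\sigma}_k)^t \mid \mathcal{G}^{\mathcal{O}}_{k-1}]$:

$$\Gamma^{\sigma} = \begin{pmatrix} B_0^0 + B_1^0 & 2B_0^1 & 2B_1^1 & B_0^2 + B_1^2 \\ 2B_0^1 & 4A_0^2 & 4A_{01}^2 & 2B_0^3 \\ 2B_1^1 & 4A_{01}^2 & 4A_1^2 & 2B_1^3 \\ B_0^2 + B_1^2 & 2B_0^3 & 2B_1^3 & B_0^4 + B_1^4 \end{pmatrix}. \qquad (6.1)$$

The first assumption of Theorem 3.II.10 of [12] holds under $P_{\overline{\mathcal{E}}}$ and we easily prove the second one
to conclude that under $P_{\overline{\mathcal{E}}}$

$$M^{(\sigma,n)}_{\nu_n} = |\mathbb{T}^*_{n-1}|^{-1/2} \sum_{k \in \mathbb{T}_{n-1}} D^{\sigma}_k = |\mathbb{T}^*_{n-1}|^{-1/2}\, U_{n-1}(\boldsymbol{\sigma}_n - \boldsymbol{\sigma}) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, \Gamma^{\sigma}).$$

We conclude using Proposition 4.14 and Theorem 3.3. □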



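Proof of Theorem 3.5, third step: We apply again Theorem 3.II.10 of [12] to the sequence
of $(\mathcal{G}^{\mathcal{O}}_p)$-martingales $(M^{(\rho,n)}_p)_{\{p \ge 1\}}$ defined by

$$|\mathbb{T}^*_{n-1}|^{1/2} M^{(\rho,n)}_p = \sum_{k=1}^{p} D^{\rho}_k = \sum_{k=1}^{p} \begin{pmatrix} \epsilon_{2k}\epsilon_{2k+1} - E[\epsilon_{2k}\epsilon_{2k+1} \mid \mathcal{F}^{\mathcal{O}}_{r_k}] \\ 2X_k \big( \epsilon_{2k}\epsilon_{2k+1} - E[\epsilon_{2k}\epsilon_{2k+1} \mid \mathcal{F}^{\mathcal{O}}_{r_k}] \big) \\ X_k^2 \big( \epsilon_{2k}\epsilon_{2k+1} - E[\epsilon_{2k}\epsilon_{2k+1} \mid \mathcal{F}^{\mathcal{O}}_{r_k}] \big) \end{pmatrix}.$$

Set $\nu_n = |\mathbb{T}_{n-1}| = 2^n - 1$. Thus one can rewrite $|\mathbb{T}^*_{n-1}|^{1/2} M^{(\rho,n)}_{\nu_n} = V_{n-1}(\boldsymbol{\rho}_n - \boldsymbol{\rho})$. Let us denote

$$C(k) = \delta_{2k}\delta_{2k+1} \Big( \big( \theta(2,0,2,0) - \rho_{\epsilon}^2 \big) + 2\big( \theta(2,0,1,1) + \theta(1,1,2,0) - 2\rho\rho_{\epsilon} \big) X_k + \big( \theta(0,2,2,0) + \theta(2,0,0,2) + 4\theta(1,1,1,1) - 4\rho^2 - 2\rho_{\epsilon}\rho_{\eta} \big) X_k^2 + 2\big( \theta(0,2,1,1) + \theta(1,1,0,2) - 2\rho\rho_{\eta} \big) X_k^3 + \big( \theta(0,2,0,2) - \rho_{\eta}^2 \big) X_k^4 \Big).$$

We are now able to write

$$E_{\overline{\mathcal{E}}}\big[ D^{\rho}_k (D^{\rho}_k)^t \mid \mathcal{G}^{\mathcal{O}}_{k-1} \big] = C(k) \begin{pmatrix} 1 & 2X_k & X_k^2 \\ 2X_k & 4X_k^2 & 2X_k^3 \\ X_k^2 & 2X_k^3 & X_k^4 \end{pmatrix}.$$

For the determination of the limit $\Gamma^{\rho}$ of $|\mathbb{T}^*_{n-1}|^{-1} \sum_{k=1}^{\nu_n} E_{\overline{\mathcal{E}}}[D^{\rho}_k (D^{\rho}_k)^t \mid \mathcal{G}^{\mathcal{O}}_{k-1}]$, let us remark, using
Proposition 4.11, that

$$\lim_{n \to \infty} \frac{1}{|\mathbb{T}^*_{n-1}|} \sum_{k=1}^{\nu_n} C(k) X_k^q = C^q \quad \text{a.s.}$$

with

$$C^q = \big( \theta(2,0,2,0) - \rho_{\epsilon}^2 \big) \ell_{01}(q) + 2\big( \theta(2,0,1,1) + \theta(1,1,2,0) - 2\rho\rho_{\epsilon} \big) \ell_{01}(1+q) + \big( \theta(0,2,2,0) + \theta(2,0,0,2) + 4\theta(1,1,1,1) - 4\rho^2 - 2\rho_{\epsilon}\rho_{\eta} \big) \ell_{01}(2+q) + 2\big( \theta(0,2,1,1) + \theta(1,1,0,2) - 2\rho\rho_{\eta} \big) \ell_{01}(3+q) + \big( \theta(0,2,0,2) - \rho_{\eta}^2 \big) \ell_{01}(4+q).$$

The matrix $\Gamma^{\rho}$ is thus given by

$$\Gamma^{\rho} = \begin{pmatrix} C^0 & 2C^1 & C^2 \\ 2C^1 & 4C^2 & 2C^3 \\ C^2 & 2C^3 & C^4 \end{pmatrix}. \qquad (6.2)$$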
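The structure of Eq. (6.2) follows from the regressor $v_k = (1, 2X_k, X_k^2)^t$: the $(i,j)$ entry of $v_k v_k^t$ is $w_i w_j X_k^{i+j}$ with weights $w = (1, 2, 1)$, so averaging $C(k)\, v_k v_k^t$ only involves the five limits $C^0, \dots, C^4$. A minimal sketch follows; the function names are hypothetical and plain averages stand in for the normalised sums.

```python
def gamma_rho(c):
    """Assemble the 3x3 matrix of Eq. (6.2) from the five limits c[q] = C^q, q = 0..4."""
    w = (1.0, 2.0, 1.0)  # weights of the regressor (1, 2 X_k, X_k^2)
    return [[w[i] * w[j] * c[i + j] for j in range(3)] for i in range(3)]


def empirical_gamma(c_vals, x_vals):
    """Plain average of C(k) * v_k v_k^t with v_k = (1, 2 X_k, X_k^2)."""
    n = len(x_vals)
    out = [[0.0] * 3 for _ in range(3)]
    for ck, xk in zip(c_vals, x_vals):
        v = (1.0, 2.0 * xk, xk * xk)
        for i in range(3):
            for j in range(3):
                out[i][j] += ck * v[i] * v[j] / n
    return out
```

Feeding `gamma_rho` with the empirical moments of $C(k) X_k^q$ reproduces `empirical_gamma` exactly, which is the finite-sample counterpart of the limit stated above.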

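The first assumption of Theorem 3.II.10 of [12] holds under $P_{\overline{\mathcal{E}}}$ and we easily prove the second one
to conclude that under $P_{\overline{\mathcal{E}}}$

$$M^{(\rho,n)}_{\nu_n} = |\mathbb{T}^*_{n-1}|^{-1/2} \sum_{k \in \mathbb{T}_{n-1}} D^{\rho}_k = |\mathbb{T}^*_{n-1}|^{-1/2}\, V_{n-1}(\boldsymbol{\rho}_n - \boldsymbol{\rho}) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, \Gamma^{\rho}).$$

We conclude using Proposition 4.14 and Theorem 3.4. □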



7 Application to real data 

We have applied our estimation procedure to the Escherichia coli data of [25] (these data are
available on request from the corresponding author of [25]). E. coli is a rod-shaped bacterium that
reproduces by dividing in the middle. Each cell thus has a new end (or pole) and an older one.
The cell that inherits the old pole of its mother is called the old pole cell, the cell that inherits the
new pole of its mother is called the new pole cell. Therefore, each cell has a type: old pole (even)
or new pole (odd), inducing asymmetry in the cell division. Stewart et al. [25] filmed colonies of
dividing cells, determining the complete lineages and the growth rate of each cell. Several attempts
have already been made to fit BAR processes to these data, see [14], […], […], [9], but only with fixed
coefficients models. In particular, [13] suggests that such models cannot explain all the randomness
of the data.

We have run our estimators on the data set penna-2002-10-04-4 from the experiments of [25].
It is the largest data set of the experiment. It contains 663 cells up to generation 9 (note that
there would be 1023 cells in a full tree up to generation 9). For each of the 663 observed cells,
the measure of the growth rate is available. For each cell, its length was recorded from birth to
division and the corresponding growth curve was fitted by an exponential function $t \mapsto \exp(\lambda t)$
where $\lambda$ is called the growth rate of the cell. Growth rates go from the minimum value 0.009 to
the maximum 0.067. The 0.01-quantile equals 0.024 and the 0.99-quantile equals 0.049. Mean and
median equal 0.037 (std: 0.004). Note that even though the number of observed generations $n = 9$
is low, the rate of convergence of our estimators is $|\mathbb{T}^*_n|^{-1/2}$, which is of order $m^{-n/2}$. Here for
$n = 9$, $|\mathbb{T}^*_9| = 663$. Table 1 gives the estimation $\hat{\boldsymbol{\theta}}_9$ of $\boldsymbol{\theta}$ with the 95% Confidence Interval (C.I.)
of each coefficient. Note that our estimator $\hat{\boldsymbol{\theta}}_n$ of $\boldsymbol{\theta}$ is exactly the same as in [5], and of course we
obtain the same point estimation. The confidence intervals are wider, as the variance is different.
More precisely, the variance is given by the CLT for $\hat{\boldsymbol{\theta}}_n$ in Eq. (3.3). We have approximated it by
$S_8^{-1} \Gamma_8 S_8^{-1}$ thanks to the convergences given in Proposition 4.14 and Lemma 5.4. Table 2 gives
the estimation of the variance coefficients $\sigma_{\epsilon}^2$ and $\sigma_{\eta}^2$ (other covariance coefficients of $\hat{\boldsymbol{\sigma}}_9$ and
$\hat{\boldsymbol{\rho}}_9$ can be computed but are less easy to interpret). The variance of these parameters is again
given by the central limit Theorem 3.5. To obtain confidence intervals, one needs an estimation
of the joint moments of $(\epsilon_2, \eta_2, \epsilon_3, \eta_3)$ up to order 4. Such estimators can be easily derived
following the same ideas as in Section 3.1. A Wald test for the positivity of $\sigma_{\epsilon}^2$ (resp. $\sigma_{\eta}^2$) can
be derived from Theorem 3.5. It rejects the null hypothesis $H_0 : \sigma_{\epsilon}^2 = 0$ (resp. $H_0 : \sigma_{\eta}^2 = 0$) with
p-value $p = 0.0799$ (resp. $p = 0.0671$). We are not far from supporting the validity of the random
coefficients model.

              a                   b                    c                   d
Estimate      0.0363              0.0266               0.0306              0.1706
95% C.I.      [0.0275, 0.0450]    [-0.2094, 0.2627]    [0.0216, 0.0396]    [-0.0709, 0.4120]

Table 1: Estimation of $\boldsymbol{\theta}$ on the data set penna-2002-10-04-4

              $\sigma_{\epsilon}^2$          $\sigma_{\eta}^2$
Estimate      0.0004              0.2431
95% C.I.      [-0.0002, 0.0010]   [-0.0750, 0.5613]

Table 2: Estimation of noise variances on the data set penna-2002-10-04-4
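As a rough plausibility check of the reported p-values, one can recover a standard error from a confidence interval and recompute a Wald statistic. The sketch below assumes that the intervals of Table 2 are symmetric Gaussian intervals at level 95% and that the test is one-sided (alternative: the variance is positive). These assumptions are made here for illustration and are not stated in the text, and the function name is hypothetical.

```python
from statistics import NormalDist


def wald_from_interval(estimate, lower, upper, level=0.95):
    """Recover the standard error from a symmetric Gaussian interval, then test
    H0: parameter = 0 against the one-sided alternative parameter > 0."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    std_err = (upper - lower) / (2.0 * z_crit)
    z_stat = estimate / std_err
    p_value = 1.0 - NormalDist().cdf(z_stat)
    return std_err, z_stat, p_value


# Rounded entries of Table 2 for the second variance coefficient
std_err, z_stat, p_value = wald_from_interval(0.2431, -0.0750, 0.5613)
```

With the rounded entries for $\sigma_{\eta}^2$, the recomputed p-value is close to the reported 0.0671. For $\sigma_{\epsilon}^2$ the table entries carry too few significant digits to expect the same agreement.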